#457 · Abu-Rasheed 2024
Knowledge Graphs as Context Sources for LLM-Based Explanations of Learning Recommendations
15th IEEE Global Engineering Education Conference (IEEE EDUCON), Greece. IEEE, 2024. DOI: 10.1109/educon60312.2024.10578654 · Ref ID: 3157 In the era of personalized education, the provision of comprehensible explanations for learning recommendations is of great value to enhance the learner's understanding and engagement with the recommended learning content. Large language models (LLMs) and generative AI have recently opened new doors for generating human-like explanations, for and along learning recommendations. However, their precision is still far from acceptable in a sensitive field like education. To harness the abilities of LLMs, while still ensuring a high level of precision towards the intent of the learners, this paper proposes an approach that utilizes knowledge graphs (KGs) as a source of factual context for LLM prompts, reducing the risk of model hallucinations and safeguarding against wrong or imprecise information, while maintaining an application-intended learning context. We utilize the semantic relations in the knowledge graph to offer curated knowledge about learning recommendations. With domain experts in the loop, we design the explanation as a textual template, which is filled and completed by the LLM. Domain experts were integrated in the prompt engineering phase as part of a study, to ensure that explanations include information that is relevant to the learner. We evaluate our approach quantitatively using ROUGE-N and ROUGE-L measures, as well as qualitatively with experts and learners. Our results show enhanced recall and precision of the generated explanations compared to those generated solely by the GPT model, with a greatly reduced risk of generating imprecise information in the final learning explanation.
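The core mechanism this abstract describes, pulling curated facts about a recommended item from a knowledge graph and filling an expert-designed textual template so the LLM completes rather than invents the explanation, can be sketched as below. All entity and relation names here are hypothetical illustrations, not the paper's schema.

```python
# Toy knowledge graph as (subject, relation, object) triples.
KG = [
    ("Intro to Python", "teaches", "basic programming"),
    ("Intro to Python", "prerequisite_for", "Data Structures"),
    ("Intro to Python", "matches_goal", "software engineering career"),
]

def facts_about(entity, kg):
    """Collect the triples whose subject is the recommended item."""
    return [(r, o) for s, r, o in kg if s == entity]

def build_prompt(recommendation, learner_goal, kg):
    """Fill a fixed textual template with KG facts; the LLM only completes it."""
    facts = "; ".join(f"{r.replace('_', ' ')}: {o}" for r, o in facts_about(recommendation, kg))
    return (
        f"Using ONLY these facts ({facts}), explain to a learner whose goal is "
        f"'{learner_goal}' why the course '{recommendation}' was recommended."
    )

prompt = build_prompt("Intro to Python", "software engineering career", KG)
```

Constraining the completion to facts retrieved from the KG is what the paper credits with reducing hallucination risk; the template itself would be designed with domain experts.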
Voted: Davis, Mike · Final decision: (not recorded)
#2469 · Abuosba 2015
Formalizing big data processing lifecycles: Acquisition, serialization, aggregation, analysis, mining, knowledge representation, and information dissemination
2015 International Conference and Workshop on Computing and Communication (IEMCON), pp. 1-4, 2015. DOI: 10.1109/IEMCON.2015.7344533 · Ref ID: 6054 In today's e-Business environment, ERP, CRM, collaboration tools, and networked sensors may be characterized as data-generating resources. Business Intelligence (BI) is a term that incorporates a range of analytical and decision-support applications in business, including data mining, decision support systems, knowledge management systems, and online analytical processing; processing data within these systems produces new data that grow rapidly, creating data-management limitations if handled by a Relational Database Management System (RDBMS) or statistical tools. Collectively, these structured and unstructured data are referred to as Big Data. Successful and efficient handling of Big Data requires deployment of specific IT infrastructure components as well as adoption of an emerging service model. In this research we introduce a conceptual model that abstracts the processing scheme of the big data processing lifecycle. The model addresses the main phases of the lifecycle: data acquisition, data serialization, data aggregation, data analysis, data mining, knowledge representation, and information dissemination. The model is driven by projecting Service Oriented Architecture attributes onto the building blocks of the lifecycle and adhering to the Lifecycle Modeling Language specification.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#335 · Addad 2024
Homeopathic Poisoning of RAG Systems
43rd International Conference on Computer Safety, Reliability and Security (SAFECOMP), vol. 14989, pp. 358-364, Florence, Italy. Springer, 2024. DOI: 10.1007/978-3-031-68738-9_28 · Ref ID: 3762 Despite their remarkable success and wide use in many applications, large language models (LLMs) are not free from intrinsic vulnerabilities (e.g. prompt injection). They may also suffer from hallucinations and a drop in performance due to a lack of up-to-date knowledge. Retrieval-Augmented Generation (RAG) is currently one of the most promising techniques to mitigate such issues. In short, a RAG system augments each prompt with relevant context from an external knowledge database. Usually, the context is composed of the texts that are most similar to the request. While reducing hallucinations, RAG at the same time enlarges the attack surface of the whole system: an attacker may poison the knowledge database by injecting bad or misleading information. In this paper, we introduce HOPRAG, a subtle but very efficient poisoning technique that consists of adding a suffix (or prefix) of only a few tokens (sub-words) to any given text to raise (or decrease) its similarity with a prompt, and therefore cause it to be used (or avoid being used) as context by the RAG system when answering. Our results show that with only three injected tokens, we manage to perform a successful attack.
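The geometric effect the abstract describes can be illustrated with a toy model. This is not the paper's attack: real RAG stacks rank passages by dense-embedding similarity, but even with a crude bag-of-words cosine, appending a few tokens copied from the target query pulls a passage closer to that query, which is the lever HOPRAG exploits.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two whitespace-tokenized texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb)

query = "how do I reset my router password"
benign = "routers ship with a default admin account"
poisoned = benign + " reset router password"  # three injected tokens

before = cosine(query, benign)    # no token overlap with the query
after = cosine(query, poisoned)   # overlap manufactured by the suffix
```

With dense embeddings the injected tokens are found by optimization rather than copied verbatim, but the goal is the same: move the passage into the retrieved top-k for the targeted prompt.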
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#625 · Afshar 2024
On the role of the UMLS in supporting diagnosis generation proposed by Large Language Models
Objective: Traditional knowledge-based and machine learning diagnostic decision support systems have benefited from integrating the medical domain knowledge encoded in the Unified Medical Language System (UMLS). The emergence of Large Language Models (LLMs) to supplant traditional systems poses questions of the quality and extent of the medical knowledge in the models' internal knowledge representations and the need for external knowledge sources. The objective of this study is three-fold: to probe the diagnosis-related medical knowledge of popular LLMs, to examine the benefit of providing the UMLS knowledge to LLMs (grounding the diagnosis predictions), and to evaluate the correlations between human judgments and the UMLS-based metrics for generations by LLMs. Methods: We evaluated diagnoses generated by LLMs from consumer health questions and daily care notes in the electronic health records using the ConsumerQA and Problem Summarization datasets. Probing LLMs for the UMLS knowledge was performed by prompting the LLM to complete the diagnosis-related UMLS knowledge paths. Grounding the predictions was examined in an approach that integrated the UMLS graph paths and clinical notes in prompting the LLMs. The results were compared to prompting without the UMLS paths. The final experiments examined the alignment of different evaluation metrics, UMLS-based and non-UMLS, with human expert evaluation. Results: In probing the UMLS knowledge, GPT-3.5 significantly outperformed Llama2 and a simple baseline yielding an F1 score of 10.9% in completing one-hop UMLS paths for a given concept. Grounding diagnosis predictions with the UMLS paths improved the results for both models on both tasks, with the highest improvement (4%) in SapBERT score. There was a weak correlation between the widely used evaluation metrics (ROUGE and SapBERT) and human judgments. 
Conclusion: We found that while popular LLMs contain some medical knowledge in their internal representations, augmentation with the UMLS knowledge provides performance gains around diagnosis generation. The UMLS needs to be tailored for the task to improve the LLMs' predictions. Finding evaluation metrics that align with human judgments better than the traditional ROUGE and BERT-based scores remains an open research question.
Voted: Davis, Ishan · Final decision: (not recorded)
#429 · Agarwal 2021
Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training
Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 3554-3565, virtual. Association for Computational Linguistics, 2021. Ref ID: 2979 Prior work on Data-To-Text Generation, the task of converting knowledge graph (KG) triples into natural text, focused on domain-specific benchmark datasets. In this paper, however, we verbalize the entire English Wikidata KG, and discuss the unique challenges associated with a broad, open-domain, large-scale verbalization. We further show that verbalizing a comprehensive, encyclopedic KG like Wikidata can be used to integrate structured KGs and natural language corpora. In contrast to the many architectures that have been developed to integrate these two sources, our approach converts the KG into natural text, allowing it to be seamlessly integrated into existing language models. It carries the further advantages of improved factual accuracy and reduced toxicity in the resulting language model. We evaluate this approach by augmenting the retrieval corpus in a retrieval language model and showing significant improvements on the knowledge-intensive tasks of open-domain QA and the LAMA knowledge probe.
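The Data-To-Text task this entry describes can be sketched in its simplest form: mapping (subject, relation, object) triples to English sentences through per-relation templates. The actual work fine-tunes a neural model over all of Wikidata; the relation names and templates below are illustrative assumptions only.

```python
# Hypothetical per-relation templates; a template-based verbalizer is the
# classic baseline that neural Data-To-Text models are trained to surpass.
TEMPLATES = {
    "date_of_birth": "{s} was born on {o}.",
    "occupation": "{s} works as a {o}.",
    "citizen_of": "{s} is a citizen of {o}.",
}

def verbalize(triples):
    """Turn (subject, relation, object) triples into short natural-language text."""
    sentences = [TEMPLATES[r].format(s=s, o=o) for s, r, o in triples if r in TEMPLATES]
    return " ".join(sentences)

text = verbalize([
    ("Ada Lovelace", "date_of_birth", "10 December 1815"),
    ("Ada Lovelace", "occupation", "mathematician"),
])
```

Once the KG is rendered as text like this, it can be dropped into an ordinary retrieval corpus or pre-training mix, which is the integration route the paper argues for.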
Voted: Kwesi, Davis · Final decision: (not recorded)
#107 · Agrawal 2023
CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities
32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), pp. 2604-2609, Busan, South Korea. IEEE, 2023. DOI: 10.1109/ro-man57019.2023.10309325 · Ref ID: 3496 This paper introduces a novel method for determining the best room in which to place an object, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement-learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a) encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories in comparison with state-of-the-art baselines.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#1296 · Ahmed 2023
Explainable Integration of Knowledge Graphs Using Large Language Models
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13913, pp. 124-139. Springer, 2023. DOI: 10.1007/978-3-031-35320-8_9 · Ref ID: 5252 Linked knowledge graphs build the backbone of many data-driven applications such as search engines, conversational agents and e-commerce solutions. Declarative link discovery frameworks use complex link specifications to express the conditions under which a link between two resources can be deemed to exist. However, understanding such complex link specifications is a challenging task for non-expert users of link discovery frameworks. In this paper, we address this drawback by devising NMV-LS, a language-model-based verbalization approach for translating complex link specifications into natural language. NMV-LS relies on the results of rule-based link specification verbalization to apply continuous training on T5, a large language model based on the Transformer architecture. We evaluated NMV-LS on English and German datasets using well-known machine translation metrics such as BLEU, METEOR, ChrF++ and TER. Our results suggest that our approach achieves a verbalization performance close to that of humans and outperforms state-of-the-art approaches. Our source code and datasets are publicly available at https://github.com/dice-group/NMV-LS. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Voted: Davis, Srividya · Final decision: (not recorded)
#3731 · Ahn 2016
A Neural Knowledge Language Model
arXiv, 2016. Ref ID: 7353 Current language models have a significant limitation in the ability to encode and decode factual knowledge. This is mainly because they acquire such knowledge from statistical co-occurrences, although most of the knowledge words are rarely observed. In this paper, we propose a Neural Knowledge Language Model (NKLM) which combines symbolic knowledge provided by a knowledge graph with an RNN language model. By predicting whether the word to generate has an underlying fact or not, the model can generate knowledge-related words by copying from the description of the predicted fact. In experiments, we show that the NKLM significantly improves performance while generating a much smaller number of unknown words.
Voted: Davis, Mike · Final decision: (not recorded)
#3791 · Ahrabian 2023
PubGraph: A Large-Scale Scientific Knowledge Graph
arXiv, 2023. Ref ID: 7643 Research publications are the primary vehicle for sharing scientific progress in the form of new discoveries, methods, techniques, and insights. Unfortunately, the lack of a large-scale, comprehensive, and easy-to-use resource capturing the myriad relationships between publications, their authors, and venues presents a barrier to applications for gaining a deeper understanding of science. In this paper, we present PubGraph, a new resource for studying scientific progress that takes the form of a large-scale knowledge graph (KG) with more than 385M entities, 13B main edges, and 1.5B qualifier edges. PubGraph is comprehensive and unifies data from various sources, including Wikidata, OpenAlex, and Semantic Scholar, using the Wikidata ontology. Beyond the metadata available from these sources, PubGraph includes outputs from auxiliary community detection algorithms and large language models. To further support studies on reasoning over scientific networks, we create several large-scale benchmarks extracted from PubGraph for the core task of knowledge graph completion (KGC). These benchmarks present many challenges for knowledge graph embedding models, including an adversarial community-based KGC evaluation setting, zero-shot inductive learning, and large-scale learning. All of the aforementioned resources are accessible at https://pubgraph.isi.edu/ and released under the CC-BY-SA license. We plan to update PubGraph quarterly to accommodate the release of new publications.
Voted: Davis, Srividya · Final decision: (not recorded)
#1782 · Akbacak 2014
Rapidly building domain-specific entity-centric language models using semantic web knowledge sources
Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), pp. 2872-2876. International Speech Communication Association, 2014. Ref ID: 5806 For domain-specific speech recognition tasks, it is best if the statistical language model component is trained with text data that is content-wise and style-wise similar to the targeted domain for which the application is built. For state-of-the-art language modeling techniques that can be used in real time within speech recognition engines during first-pass decoding (e.g., N-gram models), the above constraints have to be fulfilled in the training data. However, collecting such data, even through crowdsourcing, is expensive and time consuming, and can still be unrepresentative of how a much larger user population would interact with the recognition system. In this paper, we address this problem by employing several semantic web sources that already contain the domain-specific knowledge, such as query click logs and knowledge graphs. We build statistical language models that meet the requirements listed above for domain-specific recognition tasks where natural language is used and the user queries are about named entities in a specific domain. As a case study, in the movies domain where users' voice queries are movie-related, compared to a generic web language model, a language model trained with the above resources not only yields significant perplexity and word-error-rate improvements, but also demonstrates an approach by which such language models can be rapidly developed for other domains. Copyright © 2014 ISCA.
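The training-data construction this entry describes can be sketched at its smallest scale: expand entity-centric query templates with entity names drawn from a knowledge source, then count N-grams for a first-pass language model. The templates and movie titles below are hypothetical stand-ins for the paper's query-click-log and knowledge-graph sources.

```python
from collections import Counter

# Hypothetical query templates and entities; in the paper these come from
# click logs and a knowledge graph rather than being hand-written.
templates = ["show me {e}", "when was {e} released"]
entities = ["inception", "alien"]

# Expand templates with entities to synthesize domain-specific training text.
corpus = [t.format(e=e) for t in templates for e in entities]

def bigrams_of(line):
    """Adjacent word pairs of one training sentence."""
    toks = line.split()
    return list(zip(toks, toks[1:]))

# Bigram counts: the raw statistics behind a first-pass N-gram LM
# (a real model would add smoothing and backoff on top of these counts).
bigrams = Counter(b for line in corpus for b in bigrams_of(line))
```

Because the entity slots are filled from a domain knowledge source, the resulting counts cover entity names a generic web corpus would rarely contain, which is where the reported perplexity and word-error-rate gains come from.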
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#1187 · AlHasanRony 2022
DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation
Findings of the Association for Computational Linguistics: NAACL 2022, pp. 2557-2571. Association for Computational Linguistics, 2022. Ref ID: 5458 Task-oriented dialogue generation is challenging since the underlying knowledge is often dynamic and effectively incorporating knowledge into the learning process is hard. It is particularly challenging to generate both human-like and informative responses in this setting. Recent research primarily focused on various knowledge distillation methods where the underlying relationship between the facts in a knowledge base is not effectively captured. In this paper, we go one step further and demonstrate how the structural information of a knowledge graph can improve the system's inference capabilities. Specifically, we propose DialoKG, a novel task-oriented dialogue system that effectively incorporates knowledge into a language model. Our proposed system views relational knowledge as a knowledge graph and introduces (1) a structure-aware knowledge embedding technique, and (2) a knowledge-graph-weighted attention masking strategy to facilitate the system selecting relevant information during dialogue generation. An empirical evaluation demonstrates the effectiveness of DialoKG over state-of-the-art methods on several standard benchmark datasets.
Voted: Davis, Srividya · Final decision: (not recorded)
#2530 · Al-Sabahi 2018
A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS)
The recent advances in neural network architectures and training algorithms have shown the effectiveness of representation learning. Neural-network-based models generate better representations than traditional ones and have the ability to automatically learn distributed representations for sentences and documents. To this end, we propose a novel model that addresses several issues not adequately modeled by previously proposed models, such as the memory problem and incorporating knowledge of document structure. Our model uses a hierarchical structured self-attention mechanism to create the sentence and document embeddings. This architecture mirrors the hierarchical structure of the document and in turn enables us to obtain better feature representations. The attention mechanism provides an extra source of information to guide the summary extraction. The model treats the summarization task as a classification problem in which it computes the respective probabilities of sentence-summary membership. The model predictions are broken up by several features such as information content, salience, novelty, and positional representation. The proposed model was evaluated on two well-known datasets, the CNN/Daily Mail and DUC 2002. The experimental results show that our model outperforms the current extractive state of the art by a considerable margin.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#3936 · Alam 2023
Towards Semantically Enriched Embeddings for Knowledge Graph Completion
arXiv, 2023. Ref ID: 7789 Embedding-based Knowledge Graph (KG) Completion has gained much attention over the past few years. Most of the current algorithms consider a KG as a multidirectional labeled graph and lack the ability to capture the semantics underlying the schematic information. In a separate development, a vast amount of information has been captured within the Large Language Models (LLMs) which has revolutionized the field of Artificial Intelligence. KGs could benefit from these LLMs and vice versa. This vision paper discusses the existing algorithms for KG completion based on the variations for generating KG embeddings. It starts with discussing various KG completion algorithms such as transductive and inductive link prediction and entity type prediction algorithms. It then moves on to the algorithms utilizing type information within the KGs, LLMs, and finally to algorithms capturing the semantics represented in different description logic axioms. We conclude the paper with a critical reflection on the current state of work in the community and give recommendations for future directions.
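For readers new to the embedding-based KG completion this survey covers, the classic TransE scoring idea (a standard baseline in the field, not an algorithm proposed by this paper) illustrates what "KG embeddings" do: a triple (h, r, t) is plausible when the translated head h + r lands near the tail t in vector space. The 2-D vectors below are hand-picked toy values, not trained embeddings.

```python
def transe_score(h, r, t):
    """Negative L1 distance of h + r from t; higher means more plausible."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

# Toy embeddings chosen so that paris + capital_of == france exactly.
paris = [1.0, 0.0]
capital_of = [0.0, 1.0]
france = [1.0, 1.0]
germany = [3.0, 2.0]

good = transe_score(paris, capital_of, france)   # translation fits the tail
bad = transe_score(paris, capital_of, germany)   # translation misses
```

Link prediction then amounts to ranking candidate tails by this score; the survey's point is that such purely geometric scores ignore schema semantics (types, description logic axioms), which LLMs might supply.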
Voted: Davis, brandon · Final decision: (not recorded)
#1116 · Alatrash 2024
ConceptGCN: Knowledge concept recommendation in MOOCs based on knowledge graph convolutional networks and SBERT
Massive Open Online Courses (MOOCs) have gained popularity in the technology-enhanced learning (TEL) domain. To enhance the learning experience in MOOCs, educational recommender systems (ERSs) can play a crucial role by suggesting courses or learning materials that align with students' knowledge states. Thus, understanding a student's learning needs and predicting knowledge concepts that the student might be interested in are important to provide effective recommendations. Inspired by the superior ability of knowledge graphs (KGs) in modeling the heterogeneous data in MOOCs and Graph Neural Networks (GNNs) in learning on graph-structured data, a few works focusing on GNN-based recommendation of knowledge concepts in MOOCs have emerged recently. However, existing approaches in this domain have limitations mainly related to complexity, semantics, and transparency. To address these limitations, in this paper we propose ConceptGCN, an end-to-end framework that combines KGs, Graph Convolutional Networks (GCNs), and pre-trained transformer language model encoders (SBERT) to provide personalized and transparent recommendations of knowledge concepts in the MOOC platform [Blinded tool]. We conducted extensive offline experiments and an online user study (N=31), demonstrating the benefits of the ConceptGCN-based recommendation approach, in terms of several important user-centric aspects including accuracy, novelty, diversity, usefulness, overall satisfaction, use intentions, and reading intention. In particular, our results indicate that, if SBERT is used for the initial embeddings of items in the KG, a self-connection operation and a semantic similarity-based score function in the aggregation operation of GCN are not necessarily needed. © 2023 The Author(s)
Voted: Davis, Srividya · Final decision: (not recorded)
#2025 · Alberts 2021
VisualSem: A High-quality Knowledge Graph for Vision & Language
Proceedings of the 1st Workshop on Multilingual Representation Learning (MRL 2021), pp. 138-152. Association for Computational Linguistics, 2021. Ref ID: 5600 An exciting frontier in natural language understanding (NLU) and generation (NLG) calls for (vision-and-) language models that can efficiently access external structured knowledge repositories. However, many existing knowledge bases only cover limited domains or suffer from noisy data, and most of all are typically hard to integrate into neural language pipelines. To fill this gap, we release VisualSem: a high-quality knowledge graph (KG) which includes nodes with multilingual glosses, multiple illustrative images, and visually relevant relations. We also release a neural multi-modal retrieval model that can use images or sentences as inputs and retrieves entities in the KG. This multi-modal retrieval model can be integrated into any (neural network) model pipeline. We encourage the research community to use VisualSem for data augmentation and/or as a source of grounding, among other possible uses. VisualSem as well as the multi-modal retrieval models are publicly available and can be downloaded at this URL: https://github.com/iacercalixto/visualsem. © 2021 Association for Computational Linguistics.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#1687 · Alfasi 2023
Next-Generation Security Entity Linkage: Harnessing the Power of Knowledge Graphs and Large Language Models
Proceedings of the 16th ACM International Conference on Systems and Storage (SYSTOR 2023), p. 150. Association for Computing Machinery, 2023. DOI: 10.1145/3579370.3594759 · Ref ID: 4811 With the continuous increase in reported Common Vulnerabilities and Exposures (CVEs), security teams are overwhelmed by vast amounts of data, which are often analyzed manually, leading to a slow and inefficient process. To address cybersecurity threats effectively, it is essential to establish connections across multiple security entity databases, including CVEs, Common Weakness Enumeration (CWEs), and Common Attack Pattern Enumeration and Classification (CAPECs). In this study, we introduce a new approach that leverages the RotatE [4] knowledge graph embedding model, initialized with embeddings from the Ada language model developed by OpenAI [3]. Additionally, we extend this approach by initializing the embeddings for the relations. © 2023 Owner/Author(s).
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#3281 · Allen 2023
Conceptual Engineering Using Large Language Models
arXiv, 2023. Ref ID: 7975 We describe a method, based on Jennifer Nado's definition of classification procedures as targets of conceptual engineering, that implements such procedures using a large language model. We then apply this method using data from the Wikidata knowledge graph to evaluate concept definitions from two paradigmatic conceptual engineering projects: the International Astronomical Union's redefinition of PLANET and Haslanger's ameliorative analysis of WOMAN. We discuss implications of this work for the theory and practice of conceptual engineering. The code and data can be found on GitHub.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#3367 · Alper 2024
Emergent Visual-Semantic Hierarchies in Image-Text Representations
arXiv, 2024. Ref ID: 8454 While recent vision-and-language models (VLMs) like CLIP are a powerful tool for analyzing text and images in a shared semantic space, they do not explicitly model the hierarchical nature of the set of texts which may describe an image. Conversely, existing multimodal hierarchical representation learning methods require costly training from scratch, failing to leverage the knowledge encoded by state-of-the-art multimodal foundation models. In this work, we study the knowledge of existing foundation models, finding that they exhibit emergent understanding of visual-semantic hierarchies despite not being directly trained for this purpose. We propose the Radial Embedding (RE) framework for probing and optimizing hierarchical understanding, and contribute the HierarCaps dataset, a benchmark facilitating the study of hierarchical knowledge in image–text representations, constructed automatically via large language models. Our results show that foundation VLMs exhibit zero-shot hierarchical understanding, surpassing the performance of prior models explicitly designed for this purpose. Furthermore, we show that foundation models may be better aligned to hierarchical reasoning via a text-only fine-tuning phase, while retaining pretraining knowledge.
Voted: Davis, Kwesi · Final decision: (not recorded)
#2540 · Alrimawi 2018
I've Seen This Before: Sharing Cyber-Physical Incident Knowledge
2018 IEEE/ACM 1st International Workshop on Security Awareness from Design to Deployment (SEAD), pp. 33-40, 2018. DOI: 10.1145/3194707.3194714 · Ref ID: 6433 An increasing number of security incidents in cyber-physical systems (CPSs) arise from the exploitation of cyber and physical components of such systems. Knowledge about how such incidents arose is rarely captured and used systematically to enhance security and support future incident investigations. In this paper, we propose an approach to represent and share incidents knowledge. Our approach captures incident patterns – common aspects of incidents occurring in different CPSs. Our approach then allows incident patterns to be instantiated for different systems to assess if and how such patterns can manifest again. To support our approach, we provide two meta-models that represent, respectively, incident patterns and the cyber-physical systems themselves. The incident meta-model captures the characteristics of incidents, such as assets and activities. The system meta-model captures cyber and physical components and their interactions, which may be exploited during an incident. We demonstrate the feasibility of our approach in the application domain of smart buildings, by tailoring the system meta-model to represent components and interactions in this domain.
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#3793 · Alshammari 2024
PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation
arXiv, 2024. Ref ID: 8290 The exponential growth of scientific literature has resulted in information overload, challenging researchers to effectively synthesize relevant publications. This paper explores the integration of traditional reference management software with advanced computational techniques, including Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). We introduce PyZoBot, an AI-driven platform developed in Python, incorporating Zotero's reference management with OpenAI's sophisticated LLMs. PyZoBot streamlines knowledge extraction and synthesis from extensive human-curated scientific literature databases. It demonstrates proficiency in handling complex natural language queries, integrating data from multiple sources, and meticulously presenting references to uphold research integrity and facilitate further exploration. By leveraging LLMs, RAG, and human expertise through a curated library, PyZoBot offers an effective solution to manage information overload and keep pace with rapid scientific advancements. The development of such AI-enhanced tools promises significant improvements in research efficiency and effectiveness across various disciplines.
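The retrieval-augmented pipeline this entry describes can be reduced to three steps: score library entries against a query, retrieve the best matches, and prepend them as context to the LLM prompt. This is a rough sketch only; the function names, the two-entry library, and the token-overlap scorer are assumptions, whereas the real tool works against Zotero's library and dense OpenAI embeddings.

```python
from collections import Counter

# Hypothetical curated library: reference key -> abstract text.
library = {
    "ref1": "knowledge graphs reduce hallucination in learning explanations",
    "ref2": "poisoning attacks inject tokens into retrieval databases",
}

def score(query, doc):
    """Crude relevance: number of shared tokens (stand-in for embedding similarity)."""
    q, d = Counter(query.split()), Counter(doc.split())
    return sum(min(q[t], d[t]) for t in q)

def retrieve(query, k=1):
    """Return the keys of the k best-matching library entries."""
    ranked = sorted(library, key=lambda r: score(query, library[r]), reverse=True)
    return ranked[:k]

def build_prompt(query):
    """Prepend retrieved abstracts as context so the LLM answers from the library."""
    context = "\n".join(library[r] for r in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Answering only from retrieved, human-curated entries is what lets such a tool cite its sources and is the design choice the paper credits for upholding research integrity.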
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#1652 · Alshomary 2024
Modeling the Quality of Dialogical Explanations
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Main Conference Proceedings, pp. 11523-11536. European Language Resources Association (ELRA), 2024. Ref ID: 4569 Explanations are pervasive in our lives. Mostly, they occur in dialogical form, where an explainer discusses a concept or phenomenon of interest with an explainee. Leaving the explainee with a clear understanding is not straightforward due to the knowledge gap between the two participants. Previous research looked at the interaction of explanation moves, dialogue acts, and topics in successful dialogues with expert explainers. However, daily-life explanations often fail, raising the question of what makes a dialogue successful. In this work, we study explanation dialogues in terms of the interactions between the explainer and explainee and how they correlate with the quality of explanations in terms of a successful understanding on the explainee's side. In particular, we first construct a corpus of 399 dialogues from the Reddit forum Explain Like I'm Five and annotate it for interaction flows and explanation quality. We then analyze the interaction flows, comparing them to those appearing in expert dialogues. Finally, we encode the interaction flows using two language models that can handle long inputs, and we provide empirical evidence for the effectiveness boost gained through the encoding in predicting the success of explanation dialogues. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
Voted: Davis, mohammed afaan
#1947
-
Amorim 2024
text2story: A Python Toolkit to Extract and Visualize Story Components of Narrative Text
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():15761-15772 European Language Resources Association (ELRA) 2024 Ref ID: 4504 Story components, namely, events, time, participants, and their relations are present in narrative texts from different domains such as journalism, medicine, finance, and law. The automatic extraction of narrative elements encompasses several NLP tasks such as Named Entity Recognition, Semantic Role Labeling, Event Extraction, and Temporal Inference. The text2story Python package, an easy-to-use modular library, supports the narrative extraction and visualization pipeline. The package contains an array of narrative extraction tools that can be used separately or in sequence. With this toolkit, end users can process free text in English or Portuguese and obtain formal representations, like standard annotation files or a formal logical representation. The toolkit also enables narrative visualization as Message Sequence Charts (MSC), Knowledge Graphs, and Bubble Diagrams, making it useful to visualize and transform human-annotated narratives. The package combines the use of off-the-shelf and custom tools and is easily patched (replacing existing components) and extended (e.g. with new visualizations). It includes an experimental module for narrative element effectiveness assessment and is therefore also a valuable asset for researchers developing solutions for narrative extraction. To evaluate the baseline components, we present some results of the main annotators embedded in our package for datasets in English and Portuguese. We also compare the results with the extraction of narrative elements by GPT-3, a robust LLM. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Voted: Davis, Mike
#2058
-
An 2023
Construction and application of Chinese breast cancer knowledge graph based on multi-source heterogeneous data
The knowledge graph is a critical resource for medical intelligence. The general medical knowledge graph tries to include all diseases and contains much medical knowledge. However, it is challenging to review all the triples manually. Therefore the quality of the knowledge graph can not support intelligence medical applications. Breast cancer is one of the highest incidences of cancer at present. It is urgent to improve the efficiency of breast cancer diagnosis and treatment through artificial intelligence technology and improve the postoperative health status of breast cancer patients. This paper proposes a framework to construct a breast cancer knowledge graph from heterogeneous data resources in response to this demand. Specifically, this paper extracts knowledge triple from clinical guidelines, medical encyclopedias and electronic medical records. Furthermore, the triples from different data resources are fused to build a breast cancer knowledge graph (BCKG). Experimental results demonstrate that BCKG can support knowledge-based question answering, breast cancer postoperative follow-up and healthcare, and improve the quality and efficiency of breast cancer diagnosis, treatment and management. |
Voted: Srividya, Davis
#1304
-
An 2022
Exploring Pre-Trained Language Models to Build Knowledge Graph for Metal-Organic Frameworks (MOFs)
Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022 2022;():3651-3658 Institute of Electrical and Electronics Engineers Inc. 2022 DOI: 10.1109/BigData55660.2022.10020568 · Ref ID: 5450 Building a knowledge graph is a time-consuming and costly process which often applies complex natural language processing (NLP) methods for extracting knowledge graph triples from text corpora. Pre-trained large Language Models (PLM) have emerged as a crucial type of approach that provides readily available knowledge for a range of AI applications. However, it is unclear whether it is feasible to construct domain-specific knowledge graphs from PLMs. Motivated by the capacity of knowledge graphs to accelerate data-driven materials discovery, we explored a set of state-of-the-art pre-trained general-purpose and domain-specific language models to extract knowledge triples for metal-organic frameworks (MOFs). We created a knowledge graph benchmark with 7 relations for 1248 published MOF synonyms. Our experimental results showed that domain-specific PLMs consistently outperformed the general-purpose PLMs for predicting MOF related triples. The overall benchmarking results, however, show that using the present PLMs to create domain-specific knowledge graphs is still far from being practical, motivating the need to develop more capable and knowledgeable pre-trained language models for particular applications in materials science. © 2022 IEEE. |
Voted: Davis, Srividya
#3591
-
An 2023
Knowledge Graph Question Answering for Materials Science (KGQA4MAT): Developing Natural Language Interface for Metal-Organic Frameworks Knowledge Graph (MOF-KG) Using LLM
arXiv 2023;(): 2023 Ref ID: 7837 We present a comprehensive benchmark dataset for Knowledge Graph Question Answering in Materials Science (KGQA4MAT), with a focus on metal-organic frameworks (MOFs). A knowledge graph for metal-organic frameworks (MOF-KG) has been constructed by integrating structured databases and knowledge extracted from the literature. To enhance MOF-KG accessibility for domain experts, we aim to develop a natural language interface for querying the knowledge graph. We have developed a benchmark comprised of 161 complex questions involving comparison, aggregation, and complicated graph structures. Each question is rephrased in three additional variations, resulting in 644 questions and 161 KG queries. To evaluate the benchmark, we have developed a systematic approach for utilizing the LLM, ChatGPT, to translate natural language questions into formal KG queries. We also apply the approach to the well-known QALD-9 dataset, demonstrating ChatGPT's potential in addressing KGQA issues for different platforms and query languages. The benchmark and the proposed approach aim to stimulate further research and development of user-friendly and efficient interfaces for querying domain-specific materials science knowledge graphs, thereby accelerating the discovery of novel materials. |
Voted: Davis, mohammed afaan
#73
-
Anderson 2024
Bridging Domains in Chronic Lower Back Pain: Large Language Models and Ontology-Driven Strategies for Knowledge Graph Construction
11th International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO) 2024;14849():14-30 Univ Granada, Meloneras, SPAIN Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-64636-2_2 · Ref ID: 2967 Link prediction and entity resolution play pivotal roles in uncovering hidden relationships within networks and ensuring data quality in the era of heterogeneous data integration. This paper explores the utilization of large language models to enhance link prediction, particularly through knowledge graphs derived from transdisciplinary literature. Investigating zero-shot entity resolution techniques, we examine the impact of ontology-based and large language model approaches on the stability of link prediction results. Through a case study focusing on chronic lower back pain research, we analyze workflow decisions and their influence on prediction outcomes. Our research underscores the importance of robust methodologies in improving predictive accuracy and data integration across diverse domains. |
Voted: Davis, Mike
#3158
-
Anelli 2024
Sixth Knowledge-aware and Conversational Recommender Systems Workshop (KaRS)
Proceedings of the 18th ACM Conference on Recommender Systems 2024;():1245–1249 Bari, Italy Association for Computing Machinery 2024 DOI: 10.1145/3640457.3687114 · Ref ID: 7283 |
Voted: Davis, mohammed afaan
#2813
-
Angele 1996
Propose-and-revise modeled in Karl
Proceedings Mexico-USA Collaboration in Intelligent Systems Technologies. 1996;():278-287 1996 Ref ID: 6174 This paper reports an evaluation study for the specification of an average-sized expert system for configuring elevator systems using the language KARL (Knowledge Acquisition and Representation Language). Two results have been gained in this study: (i) a formal model of the used problem-solving method (PSM) Propose-and-Revise has been developed and (ii) the adequacy of the language KARL for specifying such systems has been evaluated. KARL is based on a strong conceptual model: the KARL model of expertise, which represents different aspects of the model at different layers. It clearly separates domain-specific knowledge from problem-solving-specific knowledge, which allows both parts to be reused independently of the other. KARL provides language primitives on a high level of abstraction, independent of implementation issues. KARL is a formal language which allows knowledge to be represented unambiguously. KARL is an executable language which allows the resulting model to be validated by testing and debugging. It turned out that KARL is well-suited for such specification issues. It also turned out that, due to a flexible connection between domain knowledge and problem-solving knowledge provided by KARL, both kinds of knowledge may be specified nearly independently of each other, which supports their reuse. This study gave us various insights into the adequacy of the language KARL for representing the knowledge on an abstract level. In spite of the encouraging results we gained, this study also revealed some deficiencies of the language KARL which are currently being eliminated for a future version of KARL. |
Voted: Davis, mohammed afaan
#3188
-
Anokhin 2024
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
arXiv 2024;(): 2024 Ref ID: 8447 Advancements in the capabilities of Large Language Models (LLMs) have created a promising foundation for developing autonomous agents. With the right tools, these agents could learn to solve tasks in new environments by accumulating and updating their knowledge. Current LLM-based agents process past experiences using a full history of observations, summarization, retrieval augmentation. However, these unstructured memory representations do not facilitate the reasoning and planning essential for complex decision-making. In our study, we introduce AriGraph, a novel method wherein the agent constructs and updates a memory graph that integrates semantic and episodic memories while exploring the environment. We demonstrate that our Ariadne LLM agent, consisting of the proposed memory architecture augmented with planning and decision-making, effectively handles complex tasks within interactive text game environments difficult even for human players. Results show that our approach markedly outperforms other established memory methods and strong RL baselines in a range of problems of varying complexity. Additionally, AriGraph demonstrates competitive performance compared to dedicated knowledge graph-based methods in static multi-hop question-answering. |
Voted: Davis, mohammed afaan
#3106
-
Aparna 2024
AI-Based Assistance for Management of Oral Community Knowledge in Low-Resource and Colloquial Kannada Language
Big Data Analytics in Astronomy, Science, and Engineering: 11th International Conference on Big Data Analytics, BDA 2023, Aizu, Japan, December 5–7, 2023, Proceedings 2024;():3–16 Aizu, Japan Springer-Verlag 2024 DOI: 10.1007/978-3-031-58502-9_1 · Ref ID: 7269 |
Voted: Davis, mohammed afaan
#1283
-
Arachchige 2023
Evaluating Large Language Models in Relationship Extraction from Unstructured Data: Empirical Study from Holocaust Testimonies
International Conference Recent Advances in Natural Language Processing, RANLP 2023;():117-123 Incoma Ltd 2023 DOI: 10.26615/978-954-452-092-2_013 · Ref ID: 5047 Relationship extraction from unstructured data remains one of the most challenging tasks in the field of Natural Language Processing (NLP). The complexity of relationship extraction arises from the need to comprehend the underlying semantics, syntactic structures, and contextual dependencies within the text. Unstructured data poses challenges with diverse linguistic patterns, implicit relationships, contextual nuances, complicating accurate relationship identification and extraction. The emergence of Large Language Models (LLMs), such as GPT (Generative Pre-trained Transformer), has indeed marked a significant advancement in the field of NLP.In this work, we assess and evaluate the effectiveness of LLMs in relationship extraction in the Holocaust testimonies within the context of the Historical realm. By delving into this domainspecific context, we aim to gain deeper insights into the performance and capabilities of LLMs in accurately capturing and extracting relationships within the Holocaust domain by developing a novel knowledge graph to visualise the relationships of the Holocaust. To the best of our knowledge, there is no existing study which discusses relationship extraction in Holocaust testimonies. The majority of current approaches for Information Extraction (IE) in historic documents are either manual or Optical Character Recognition (OCR) based. Moreover, in this study, we found that the Subject-Object-Verb extraction using GPT3-based relations produced more meaningful results compared to the Semantic Role labelingbased triple extraction. © 2023 Incoma Ltd. All rights reserved. |
Voted: Davis, Mike
#2142
-
Araújo 2016
Architectural approaches to build the museum of the person
2016 11th Iberian Conference on Information Systems and Technologies (CISTI) 2016;():1-6 2016 DOI: 10.1109/CISTI.2016.7521367 · Ref ID: 6608 The Museum of the Person (Museu da Pessoa, MP) is a virtual museum aimed at exhibiting life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. So the museum holds a heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to extract automatically the information included in the repository in order to build the web pages that realize the museum's exhibition rooms. This project started by creating a specific ontology (OntoMP) for the knowledge repository of MP. That ontology is intended to allow a conceptual navigation over the available information. We will adopt the standard for museum ontologies CIDOC-CRM (CIDOC Conceptual Reference Model) refined with FOAF to represent OntoMP. The objective of this paper is to discuss different architectural approaches to build a system that will create the virtual rooms from the XML repository to enable visitors to look up individual life stories and also intercross information among them. The first architecture is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) technology to extract the information, while the second proposal is based on a Relational Database and uses CaVa Generator to query the repository and build the exhibition spaces. |
Voted: Davis, mohammed afaan
#423
-
Arnold 2022
Knowledge extraction from aeronautical messages (NOTAMs) with self-supervised language models for aircraft pilots
Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) - Human Language Technologies 2022;():188-196 Seattle, WA Assoc Computational Linguistics-Acl 2022 Ref ID: 3668 During their pre-flight briefings, aircraft pilots must analyze a long list of NOTAMs (NOtice To AirMen) indicating potential hazards along the flight route, sometimes up to 100 pages for long-haul flights. NOTAM free-text fields typically have a very special phrasing, with lots of acronyms and domain-specific vocabulary, which makes it differ significantly from standard English. In this paper, we pretrain language models derived from BERT on circa 1 million unlabeled NOTAMs and reuse the learnt representations on three downstream tasks valuable for pilots: criticality prediction, named entity recognition and translation into a structured language called Airlang. This self-supervised approach, where smaller amounts of labeled data are enough for task-specific finetuning, is well suited in the aeronautical context since expert annotations are expensive and time-consuming. We present evaluation scores across the tasks showing a high potential for an operational usability of such models (by pilots, airlines or service providers), which is a first to the best of our knowledge. |
Voted: Davis, mohammed afaan
#375
-
Aspillaga 2021
Inspecting the concept knowledge graph encoded by modern language models
Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():2984-3000 Electr Network Assoc Computational Linguistics-Acl 2021 Ref ID: 3072 The field of natural language understanding has experienced exponential progress in the last few years, with impressive results in several tasks. This success has motivated researchers to study the underlying knowledge encoded by these models. Despite this, attempts to understand their semantic capabilities have not been successful, often leading to non-conclusive, or contradictory conclusions among different works. Via a probing classifier, we extract the underlying knowledge graph of nine of the most influential language models of the last years, including word embeddings, text generators, and context encoders. This probe is based on concept relatedness, grounded on WordNet. Our results reveal that all the models encode this knowledge, but suffer from several inaccuracies. Furthermore, we show that the different architectures and training strategies lead to different model biases. We conduct a systematic evaluation to discover specific factors that explain why some concepts are challenging. We hope our insights will motivate the development of models that capture concepts more precisely. |
Voted: Kwesi, Davis
#3760
-
Avnat 2024
Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As
arXiv 2024;(): 2024 Ref ID: 8359 Clinical problem-solving requires processing of semantic medical knowledge such as illness scripts and numerical medical knowledge of diagnostic tests for evidence-based decision-making. As large language models (LLMs) show promising results in many aspects of language-based clinical practice, their ability to generate non-language evidence-based answers to clinical questions is inherently limited by tokenization. Therefore, we evaluated LLMs' performance on two question types: numeric (correlating findings) and semantic (differentiating entities) while examining differences within and between LLMs in medical aspects and comparing their performance to humans. To generate straightforward multi-choice questions and answers (QAs) based on evidence-based medicine (EBM), we used a comprehensive medical knowledge graph (encompassing data from more than 50,000 peer-reviewed articles) and created the "EBMQA". EBMQA contains 105,000 QAs labeled with medical and non-medical topics and classified into numerical or semantic questions. We benchmarked this dataset using more than 24,500 QAs on two state-of-the-art LLMs: Chat-GPT4 and Claude3-Opus. We evaluated the LLMs' accuracy on semantic and numerical question types and according to sub-labeled topics. For validation, six medical experts were tested on 100 numerical EBMQA questions. We found that both LLMs excelled more in semantic than numerical QAs, with Claude3 surpassing GPT4 in numerical QAs. However, both LLMs showed inter and intra gaps in different medical aspects and remained inferior to humans. Thus, their medical advice should be addressed carefully. |
Voted: Davis, mohammed afaan
#1427
-
Azaria 2023
The Internal State of an LLM Knows When It's Lying
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():967-976 Association for Computational Linguistics (ACL) 2023 DOI: 10.18653/v1/2023.findings-emnlp.68 · Ref ID: 5057 While Large Language Models (LLMs) have shown exceptional performance in various tasks, one of their most prominent drawbacks is generating inaccurate or false information with a confident tone. In this paper, we provide evidence that the LLM's internal state can be used to reveal the truthfulness of statements. This includes both statements provided to the LLM, and statements that the LLM itself generates. Our approach is to train a classifier that outputs the probability that a statement is truthful, based on the hidden layer activations of the LLM as it reads or generates the statement. Experiments demonstrate that given a set of test sentences, of which half are true and half false, our trained classifier achieves an average of 71% to 83% accuracy labeling which sentences are true versus false, depending on the LLM base model. Furthermore, we explore the relationship between our classifier's performance and approaches based on the probability assigned to the sentence by the LLM. We show that while LLM-assigned sentence probability is related to sentence truthfulness, this probability is also dependent on sentence length and the frequencies of words in the sentence, resulting in our trained classifier providing a more reliable approach to detecting truthfulness, highlighting its potential to enhance the reliability of LLM-generated content and its practical applicability in real-world scenarios. © 2023 Association for Computational Linguistics. |
Voted: Davis, yuexi
#324
-
Azim 2024
Grounding Ontologies with Pre-Trained Large Language Models for Activity Based Intelligence
Conference on Signal Processing, Sensor/Information Fusion, and Target Recognition XXXIII 2024;13057(): National Harbor, MD Spie-Int Soc Optical Engineering 2024 DOI: 10.1117/12.3013332 · Ref ID: 3490 The development of Activity Based Intelligence (ABI) requires an understanding of individual actors' intents, their interactions with other entities in the environment, and how these interactions facilitate accomplishment of their goals. Statistical modelling alone is insufficient for such analyses, mandating higher-level representations such as ontology to capture important relationships. However, constructing ontologies for ABI, ensuring they remain grounded to real-world entities, and maintaining their applicability to downstream tasks requires substantial hand-tooling by domain experts. In this paper, we propose the use of a Large Language Model (LLM) to bootstrap a grounding for such an ontology. Subsequently, we demonstrate that the experience encoded within the weights of a pre-trained LLM can be used in a zero-shot manner to provide a model of normalcy, enabling ABI analysis at the semantics level, agnostic to the precise coordinate data. This is accomplished through a sequence of two transformations, made upon a kinematic track, toward natural language narratives suitable for LLM input. The first transformation generates an abstraction of the low-level kinematic track, embedding it within a knowledge graph using a domain-specific ABI ontology. Secondly, we employ a template-driven narrative generation process to form natural language descriptions of behavior. Computation of the LLM perplexity score upon these narratives achieves grounding of the ontology. This use does not rely on any prompt engineering. In characterizing the perplexity score for any given track, we observe significant variability given chosen parameters such as sentence verbosity, attribute count, clause ordering, and so on. 
Consequently, we propose an approach that considers multiple generated narratives for an individual track and the distribution of perplexity scores for downstream applications. We demonstrate the successful application of this methodology against a semantic track association task. Our subsequent analysis establishes how such an approach can be used to augment existing kinematics-based association algorithms. |
Voted: Davis, mohammed afaan
#175
-
Baek 2023
Direct Fact Retrieval from Knowledge Graphs without Entity Linking
61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():10038-10055 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3227 There has been a surge of interest in utilizing Knowledge Graphs (KGs) for various natural language processing/understanding tasks. The conventional mechanism to retrieve facts in KGs usually involves three steps: entity span detection, entity disambiguation, and relation classification. However, this approach requires additional labels for training each of the three subcomponents in addition to pairs of input texts and facts, and also may accumulate errors propagated from failures in previous steps. To tackle these limitations, we propose a simple knowledge retrieval framework, which directly retrieves facts from the KGs given the input text based on their representational similarities, which we refer to as Direct Fact Retrieval (DiFaR). Specifically, we first embed all facts in KGs onto a dense embedding space by using a language model trained by only pairs of input texts and facts, and then provide the nearest facts in response to the input text. Since the fact, consisting of only two entities and one relation, has little context to encode, we propose to further refine ranks of top-k retrieved facts with a reranker that contextualizes the input text and the fact jointly. We validate our DiFaR framework on multiple fact retrieval tasks, showing that it significantly outperforms relevant baselines that use the three-step approach. |
Voted: Davis, Mike
#463
-
Baghdasaryan 2024
Knowledge retrieval and diagnostics in cloud services with large language models
Efficient customer support is the foundation for any service provider trying to improve customer relationships. An important measure of successful support is the mean time to resolve issues. The complexity and large scale of modern cloud environments make it unrealistic to reduce the resolution time without deploying intelligent solutions. The latter also provide an exceptional opportunity to leverage cross-customer product usage data for proactive solutions, where the troubles of some users can be analyzed in advance to prevent similar issues for other users. We build a recommender system that matches customer support requests to other resolved support requests or knowledge base articles that contain valuable information for problem remediation. This system can be used by customers or support teams to quickly find problem-resolution tips or detect trending issues to warn vulnerable users. We utilize large language models, fine-tune them for better performance, and discuss capabilities and possible improvements. During our research, we highlighted several evaluation metrics such as mean time to resolve issues and the accuracy of recommendations. However, estimating accuracy is challenging due to insufficient datasets with precise and comprehensive recommendations. Despite this, our support managers provided some estimates regarding the remediation durations. Typically, identifying and resolving an issue takes several days or weeks. With appropriate recommendations, this time can be significantly reduced to several hours and, in some simple cases, even lead to self-service capabilities. |
Voted: Davis, mohammed afaan
#3587
-
Bahr 2024
Knowledge Graph Enhanced Retrieval-Augmented Generation for Failure Mode and Effects Analysis
arXiv 2024;(): 2024 Ref ID: 8426 Failure mode and effects analysis (FMEA) is a critical tool for mitigating potential failures, particularly during ramp-up phases of new products. However, its effectiveness is often limited by the missing reasoning capabilities of the FMEA tools, which are usually tabular-structured. Meanwhile, large language models (LLMs) offer novel prospects for fine-tuning on custom datasets for reasoning within FMEA contexts. However, LLMs face challenges in tasks that require factual knowledge, a gap that retrieval-augmented generation (RAG) approaches aim to fill. RAG retrieves information from a non-parametric data store and uses a language model to generate responses. Building on this idea, we propose to advance the non-parametric data store with a knowledge graph (KG). By enhancing the RAG framework with a KG, our objective is to leverage analytical and semantic question-answering capabilities on FMEA data. This paper contributes by presenting a new ontology for FMEA observations, an algorithm for creating vector embeddings from the FMEA KG, and a KG-enhanced RAG framework. Our approach is validated through a human study and we measure the performance of the context retrieval recall and precision. |
Voted: Kwesi, Mike
#480
-
Bai 2023
KnowPrefix-Tuning: A Two-Stage Prefix-Tuning Framework for Knowledge-Grounded Dialogue Generation
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2023;14170():525-542 Turin, ITALY Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-43415-0_31 · Ref ID: 3766 Existing knowledge-grounded conversation systems generate responses typically in a retrieve-then-generate manner. They require a large knowledge base and a strong knowledge retrieval component, which is time- and resource-consuming. In this paper, we address the challenge by leveraging the inherent knowledge encoded in the pre-trained language models (PLMs). We propose Knowledgeable Prefix Tuning (KnowPrefix-Tuning), a two-stage tuning framework, bypassing the retrieval process in a knowledge-grounded conversation system by injecting prior knowledge into the lightweight knowledge prefix. The knowledge prefix is a sequence of continuous knowledge-specific vectors that can be learned during training. In addition, we propose a novel interactive re-parameterization mechanism that allows the prefix to interact fully with the PLM during the optimization of response generation. Experimental results demonstrate that KnowPrefix-Tuning outperforms fine-tuning and other lightweight tuning approaches, and performs comparably with strong retrieval-based baselines while being 3x faster during inference (The code is available at https://github.com/fantast4ever/KnowPrefix-Tuning.) |
Votes: Davis, mohammed afaan
#373
-
Bai 2024
Infusing internalized knowledge of language models into hybrid prompts for knowledgeable dialogue generation
Existing knowledge-grounded dialogue (KGD) systems access the knowledge from an external knowledge base, then generate the context-coherent response accordingly. However, the knowledge access capability is constrained to the scale of a knowledge base. On the one hand, a small-scale knowledge base makes it hard for a model to generalize to unseen topics, while the improper shift of topics may induce an unsmooth conversation flow. On the other hand, a large-scale knowledge base requires a strong retrieval component to accurately index the context-relevant knowledge from many plausible candidates, costing significant amounts of time and resources. To address this, we regard the language model as a virtual knowledge base and propose homogenizing internalized knowledge of different language models into hybrid prompts. The hybrid prompts are a set of continuous vectors learned to represent knowledge inherently encoded in different language models. Furthermore, we devise a two-stage knowledge-grounding manner, in which both the knowledge internalized in language models and the knowledge provided by evidence can be jointly optimized to generate a knowledgeable response. We compare our proposed method with two groups of methods, including methods with explicit knowledge retrieval and those with implicit knowledge access. Experimental results on three knowledge-grounded dialogue corpora demonstrate advantages over these competitive methods. |
Votes: Davis, Ishan
#1731
-
Baldazzi 2024
“Please, Vadalog, tell me why”: Interactive Explanation of Datalog-based Reasoning
Advances in Database Technology - EDBT 2024;27():834-837 OpenProceedings.org 2024 DOI: 10.48786/edbt.2024.82 · Ref ID: 4080 Integrating Large Language Models (LLMs) with logic-based Enterprise Knowledge Graphs (EKGs) and more generally with Knowledge Representation and Reasoning (KRR) approaches is currently at the forefront of research in many data-intensive areas, as language models may complement EKGs and ontological reasoning with flexibility and human orientation. Conversely, EKGs provide transparency and explainability on the conclusions drawn, a typical weak point of LLMs, which operate opaquely. In this demo, we integrate Llama 2 with our reasoning system Vadalog and use it to turn a chase graph, i.e., the trace of an ontological reasoning process, into a human-readable business report. In other words, we show the amazing capabilities of state-of-the-art LLMs in combination with a principled exploitation of the theoretical underpinnings of logic-based reasoning. We walk the audience through a visual environment, unfolding real-world reasoning settings from the Central Bank of Italy. © 2024 Copyright held by the owner/author(s). |
Votes: Davis, mohammed afaan
#1297
-
Baldazzi 2024
Explaining Enterprise Knowledge Graphs with Large Language Models and Ontological Reasoning
OpenAccess Series in Informatics 2024;119(): Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing 2024 DOI: 10.4230/OASIcs.Tannen.2024.1 · Ref ID: 4000 In recent times, the demand for transparency and accountability in AI-driven decisions has intensified, particularly in high-stakes domains like finance and bio-medicine. This focus on the provenance of AI-generated conclusions underscores the need for decision-making processes that are not only transparent but also readily interpretable by humans, to build the trust of both users and stakeholders. In this context, the integration of state-of-the-art Large Language Models (LLMs) with logic-oriented Enterprise Knowledge Graphs (EKGs) and the broader scope of Knowledge Representation and Reasoning (KRR) methodologies is currently at the cutting edge of industrial and academic research across numerous data-intensive areas. Indeed, such a synergy is paramount as LLMs bring a layer of adaptability and human-centric understanding that complements the structured insights of EKGs. Conversely, the central role of ontological reasoning is to capture the domain knowledge, accurately handling complex tasks over a given realm of interest, and to infuse the process with transparency and a clear provenance-based explanation of the conclusions drawn, addressing the fundamental challenge of LLMs' inherent opacity and fostering trust and accountability in AI applications. In this paper, we propose a novel neuro-symbolic framework that leverages the underpinnings of provenance in ontological reasoning to enhance state-of-the-art LLMs with domain awareness and explainability, enabling them to act as natural language interfaces to EKGs. © Teodoro Baldazzi, Luigi Bellomarini, Stefano Ceri, Andrea Colombo, Andrea Gentili, Emanuel Sallinger, and Paolo Atzeni; licensed under Creative Commons License CC-BY 4.0. |
Votes: Srividya, Davis
#3844
-
Balepur 2024
Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?
arXiv 2024;(): 2024 Ref ID: 8737 Question answering (QA)-producing correct answers for input questions-is popular, but we test a reverse question answering (RQA) task: given an input answer, generate a question with that answer. Past work tests QA and RQA separately, but we test them jointly, comparing their difficulty, aiding benchmark design, and assessing reasoning consistency. 16 LLMs run QA and RQA with trivia questions/answers, showing: 1) Versus QA, LLMs are much less accurate in RQA for numerical answers, but slightly more accurate in RQA for textual answers; 2) LLMs often answer their own invalid questions from RQA accurately in QA, so RQA errors are not from knowledge gaps alone; 3) RQA errors correlate with question difficulty and inversely correlate with answer frequencies in the Dolma corpus; and 4) LLMs struggle to give valid multi-hop questions. By finding question and answer types yielding RQA errors, we suggest improvements for LLM RQA reasoning. |
Votes: Ishan, Xinchen
#1838
-
Banerjee 2023
The Role of Output Vocabulary in T2T LMs for SPARQL Semantic Parsing
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():12219-12228 Association for Computational Linguistics (ACL) 2023 Ref ID: 5199 In this work, we analyse the role of output vocabulary for text-to-text (T2T) models on the task of SPARQL semantic parsing. We perform experiments within the context of knowledge graph question answering (KGQA), where the task is to convert questions in natural language to the SPARQL query language. We observe that the query vocabulary is distinct from human vocabulary. Language Models (LMs) are predominantly trained for human language tasks, and hence, if the query vocabulary is replaced with a vocabulary more attuned to the LM tokenizer, the performance of models may improve. We carry out carefully selected vocabulary substitutions on the queries and find absolute gains in the range of 17% on the GrailQA dataset. © 2023 Association for Computational Linguistics. |
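The vocabulary-substitution idea can be illustrated with a toy mapping: SPARQL keywords and symbols are rewritten into word-like tokens before T2T training, and the mapping is inverted after generation. The substitutions below are invented for illustration, not the paper's actual choices:

```python
# Hypothetical sketch of SPARQL vocabulary substitution for a T2T model.
# The mapping is illustrative only.
SUBSTITUTIONS = {
    "SELECT": "find",
    "WHERE": "such that",
    "{": "begin",
    "}": "end",
    "?": "var_",
}

def encode(query: str) -> str:
    """Rewrite SPARQL tokens into LM-tokenizer-friendly words."""
    for sparql_tok, lm_tok in SUBSTITUTIONS.items():
        query = query.replace(sparql_tok, lm_tok)
    return query

def decode(text: str) -> str:
    """Invert the mapping (reverse insertion order unwinds it cleanly)."""
    for sparql_tok, lm_tok in reversed(list(SUBSTITUTIONS.items())):
        text = text.replace(lm_tok, sparql_tok)
    return text

q = "SELECT ?x WHERE { ?x a :City }"
assert decode(encode(q)) == q  # round-trip is lossless
```

The round-trip property matters: the model only ever sees the substituted form, and the original query is recovered exactly at inference time.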
Votes: Davis, Srividya
#748
-
Banerjee 2020
Self-Supervised Knowledge Triplet Learning for Zero-Shot Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():151-162 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3298 The aim of all Question Answering (QA) systems is to generalize to unseen questions. Current supervised methods are reliant on expensive data annotation. Moreover, such annotations can introduce unintended annotator bias, making systems focus more on the bias than the actual task. This work proposes Knowledge Triplet Learning (KTL), a self-supervised task over knowledge graphs. We propose heuristics to create synthetic graphs for commonsense and scientific knowledge. We propose using KTL to perform zero-shot question answering, and our experiments show considerable improvements over large pre-trained transformer language models. |
Votes: Ishan, Xinchen
#497
-
Banerjee 2024
Large Language Models for Few-Shot Automatic Term Extraction
29th International Conference on Applications of Natural Language to Information Systems (NLDB) 2024;14762():137-150 Univ Turin, Turin, ITALY Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-70239-6_10 · Ref ID: 3176 Automatic term extraction is the process of identifying domain-specific terms in a text using automated algorithms and is a key first step in ontology learning and knowledge graph creation. Large language models have shown good few-shot capabilities, thus, in this paper, we present a study to evaluate the few-shot in-context learning performance of GPT-3.5-Turbo on automatic term extraction. To benchmark the performance we compare the results with fine-tuning of a BERT-sized model. We also carry out experiments with count-based term extractors to assess their applicability to few-shot scenarios. We quantify prompt sensitivity with experiments to analyze the variation in performance of large language models across different prompt templates. Our results show that in-context learning with GPT-3.5-Turbo outperforms the BERT-based model and unsupervised count-based methods in few-shot scenarios. |
Votes: Davis, mohammed afaan
#331
-
Bao 2020
HHH: An Online Medical Chatbot System based on Knowledge Graph and Hierarchical Bi-Directional Attention
Australasian Computer Science Week Multiconference (ACSW) 2020;(): Swinburne Univ Technol, Melbourne, AUSTRALIA Assoc Computing Machinery 2020 Ref ID: 3039 This paper proposes a chatbot framework that adopts a hybrid model which consists of a knowledge graph and a text similarity model. Based on this chatbot framework, we build HHH, an online question-and-answer (QA) Healthcare Helper system for answering complex medical questions. HHH maintains a knowledge graph constructed from medical data collected from the Internet. HHH also implements a novel text representation and similarity deep learning model, Hierarchical BiLSTM Attention Model (HBAM), to find the most similar question from a large QA dataset. We compare HBAM with other state-of-the-art language models such as bidirectional encoder representation from transformers (BERT) and Manhattan LSTM Model (MaLSTM). We train and test the models with a subset of the Quora duplicate questions dataset in the medical area. The experimental results show that our model is able to achieve superior performance over these existing methods. |
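The MaLSTM baseline mentioned above scores a sentence pair as sim = exp(-||h1 - h2||_1) over the two sentence encodings, mapping any pair of vectors into (0, 1]. A toy sketch with hand-picked vectors standing in for LSTM outputs:

```python
import math

# Illustrative Manhattan-LSTM similarity: exp of the negative L1 distance
# between two sentence encodings. The vectors here are toy values, not
# real LSTM hidden states.

def malstm_similarity(h1, h2):
    l1 = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1)

# identical encodings score exactly 1.0; distant ones approach 0
same = malstm_similarity([0.2, 0.5], [0.2, 0.5])
far = malstm_similarity([0.2, 0.5], [0.9, 0.1])
```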
Votes: Davis, mohammed afaan
#3335
-
Bao 2023
DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation
arXiv 2023;(): 2023 Ref ID: 7821 We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge-graphs, reconstructing real-world dialogues, and incorporating human-guided preference rephrasing. These datasets are instrumental in training DISC-MedLLM, surpassing existing medical LLMs in both single-turn and multi-turn consultation scenarios. Extensive experimental results demonstrate the effectiveness of the proposed model in bridging the gap between general language models and real-world medical consultation. Additionally, we release the constructed dataset and model weights to further contribute to research and development. Further details and resources can be found at https://github.com/FudanDISC/DISC-MedLLM |
Votes: Davis, yuexi
#1246
-
Bayat 2024
Enhanced Language Model Truthfulness with Learnable Intervention and Uncertainty Expression
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():12388-12400 Association for Computational Linguistics (ACL) 2024 Ref ID: 4285 Large language models (LLMs) can generate long-form and coherent text, yet they often hallucinate facts, which undermines their reliability. To mitigate this issue, inference-time methods steer LLM representations toward the “truthful directions” previously learned for truth elicitation. However, applying these truthful directions with the same intensity fails to generalize across different query contexts. We propose LITO, a Learnable Intervention method for Truthfulness Optimization that automatically identifies the optimal intervention intensity tailored to each specific context. LITO explores a sequence of model generations based on increasing levels of intervention intensities. It selects the most accurate response or refuses to answer when the predictions are highly uncertain. Experiments on multiple LLMs and question-answering datasets demonstrate that LITO improves truthfulness while preserving task accuracy. The adaptive nature of LITO counters the limitations of one-size-fits-all intervention methods, maximizing truthfulness by reflecting the model's internal knowledge only when it is confident. Our code is available at https://github.com/launchnlp/LITO. © 2024 Association for Computational Linguistics. |
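LITO's final selection step, as described, can be sketched as follows. This is an illustrative reconstruction only; the candidate answers, confidence scores, and threshold are invented:

```python
# Hypothetical sketch of LITO-style selection: given candidate responses
# generated at increasing intervention intensities, keep the most
# confident one, or refuse when all candidates are too uncertain.

def select_or_refuse(candidates, threshold=0.5):
    """candidates: list of (answer, confidence) pairs, one per intensity."""
    best_answer, best_conf = max(candidates, key=lambda c: c[1])
    if best_conf < threshold:
        return "I don't know."  # express uncertainty rather than guess
    return best_answer

uncertain = [("Paris", 0.2), ("Lyon", 0.3)]
confident = [("Paris", 0.4), ("Paris", 0.9)]
```

The refusal branch is what lets the method trade a wrong answer for an honest expression of uncertainty.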
Votes: yuexi, Srividya
#2628
-
Bayrak 2005
Learning contextual behavior of text data
Fourth International Conference on Machine Learning and Applications (ICMLA'05) 2005;():6 pp. 2005 DOI: 10.1109/ICMLA.2005.46 · Ref ID: 6089 Understanding contextual behavior is very important in order to develop a context-aware retrieval system. This paper discusses the philosophy behind the development of the "evolutionary behavior of textual semantics" (EBOTS) system. The EBOTS system is a retrieval-oriented knowledge representation and management system. This paper proposes a formal model of correlation that can be combined with traditional local and global weighting schemes. Intuitive contextual behavior is studied as part of the proposed research work. Context retrieval based on semantic knowledge allows abstract queries to be defined, instead of exact word-based queries. The results of the context retrieval for a classic3 and time dataset using the EBOTS system have been discussed in this paper. The paper makes a contribution to semantic knowledge representation and retrieval algorithms. |
Votes: mohammed afaan, yuexi
#3544
-
Beigi 2024
InternalInspector I²: Robust Confidence Estimation in LLMs through Internal States
arXiv 2024;(): 2024 Ref ID: 8398 Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task. |
Votes: yuexi, Srividya
#36
-
Bellan 2022
Assisted Process Knowledge Graph Building Using Pre-trained Language Models
21st International Conference of the Italian-Association-for-Artificial-Intelligence (AIxIA) 2022;13796():60-74 Udine, ITALY Springer International Publishing Ag 2022 DOI: 10.1007/978-3-031-27181-6_5 · Ref ID: 2928 The automated construction of knowledge graphs from procedural documents is a challenging research area. Here, the lack of annotated data, as well as of raw text repositories describing real-world procedural documents, makes it extremely difficult to adopt deep learning approaches. Pre-trained language models have shown promising results concerning the knowledge extraction tasks from the models themselves. Although several works have explored this strategy to build knowledge graphs, the viability of knowledge base construction using a prompt-based learning strategy with such language models has not yet been investigated deeply. In this work, we present a prompt-based in-context learning strategy to extract, from natural language process descriptions, conceptual information that can be converted into their equivalent knowledge graphs. Such a strategy is performed in a multi-turn dialog fashion. We validate the accuracy of the proposed approach from both quantitative and qualitative perspectives. The results highlight the feasibility of the proposed approach within low-resource scenarios. |
Votes: Xinchen, Srividya
#669
-
Bellan 2024
Process Knowledge Extraction and Knowledge Graph Construction Through Prompting: A Quantitative Analysis
39th Annual ACM Symposium on Applied Computing (SAC) 2024;():1634-1641 Univ Salamanca, Avila, SPAIN Assoc Computing Machinery 2024 DOI: 10.1145/3605098.3635957 · Ref ID: 3080 The automated construction of process knowledge graphs from process description documents is a challenging research area. Here, the lack of massive annotated data, as well as raw text repositories describing real-world process documents, makes it extremely difficult to adopt deep learning approaches to perform this transformation. Indeed, the main challenge is to extract conceptual elements representing the actual entities or relations of the process model described within its corresponding natural language document. Large Language Models (LLMs) have shown promising results in supporting the extraction of structured knowledge from unstructured texts. Although several works explored this strategy to build or complete knowledge graphs, the exploitation of LLMs toward domain-specific knowledge base construction from scratch has not yet been investigated deeply. Our aim is to exploit the LLM capabilities to extract process knowledge from unseen natural language descriptions. In this work, we present a prompt-based in-context learning strategy to extract, from process descriptions, conceptual information that can be converted into their equivalent knowledge graphs. Such a strategy is performed in a multi-turn dialog fashion. We validate the accuracy of the proposed approach from a quantitative perspective. The results highlight the feasibility of the proposed approach within our low-resource scenarios and open interesting perspectives for future activities. |
Votes: Xinchen, Srividya
#3578
-
Bendiken 2024
KNOW: A Real-World Ontology for Knowledge Capture with Large Language Models
arXiv 2024;(): 2024 Ref ID: 8333 We present KNOW–the Knowledge Navigator Ontology for the World–the first ontology designed to capture everyday knowledge to augment large language models (LLMs) in real-world generative AI use cases such as personal AI assistants. Our domain is human life, both its everyday concerns and its major milestones. We have limited the initial scope of the modeled concepts to only established human universals: spacetime (places, events) plus social (people, groups, organizations). The inclusion criteria for modeled concepts are pragmatic, beginning with universality and utility. We compare and contrast previous work such as Schema.org and Cyc–as well as attempts at a synthesis of knowledge graphs and language models–noting how LLMs already encode internally much of the commonsense tacit knowledge that took decades to capture in the Cyc project. We also make available code-generated software libraries for the 12 most popular programming languages, enabling the direct use of ontology concepts in software engineering. We emphasize simplicity and developer experience in promoting AI interoperability. |
Votes: Mike, Ishan
#1115
-
Bertini 2024
Concept2Text: an explainable multilingual rewriting of concepts into natural language
CEUR Workshop Proceedings 2024;3733(): CEUR-WS 2024 Ref ID: 4442 Automated and explainable data interpretation hinges on two critical steps: (i) identifying emerging properties from data and representing them into abstract concepts, and (ii) translating such concepts into natural language. While Large Language Models have recently demonstrated impressive capabilities in generating natural language, their trustworthiness remains difficult to ascertain. The deployment of an explainable pipeline enables its application in high-risk activities, such as decision making. Addressing this demanding requirement is facilitated by the fertile ground of knowledge representation and automated reasoning research. Building upon previous work that explored the first step, we focus on the second step, named Concept2Text. The design of an explainable translation naturally lends itself to a logic-based model, once again highlighting the contribution of declarative programming to achieving explainability in AI. This paper explores a Prolog/CLP-based rewriting system designed to interpret concepts expressed in terms of classes and relations derived from a generic ontology, generating text in natural language. Its key features encompass hierarchical tree rewritings, modular multilingual generation, support for equivalent variants across semantic, grammar, and lexical levels, and a transparent rule-based system. We present the architecture and illustrate a simple working example that allows the generation of hundreds of different and equivalent rewritings relative to the input concept. © 2024 Copyright for this paper by its authors. |
Votes: Davis, mohammed afaan
#2423
-
Beydoun 2009
FAML: A Generic Metamodel for MAS Development
IEEE Transactions on Software Engineering 2009;35(6):841-863 2009 DOI: 10.1109/TSE.2009.34 · Ref ID: 6529 In some areas of software engineering research, there are several metamodels claiming to capture the main issues. Though it is profitable to have variety at the beginning of a research field, after some time, the diversity of metamodels becomes an obstacle, for instance to the sharing of results between research groups. To reach consensus and unification of existing metamodels, metamodel-driven software language engineering can be applied. This paper illustrates an application of software language engineering in the agent-oriented software engineering research domain. Here, we introduce a relatively generic agent-oriented metamodel whose suitability for supporting modeling language development is demonstrated by evaluating it with respect to several existing methodology-specific metamodels. First, the metamodel is constructed by a combination of bottom-up and top-down analysis and best practice. The concepts thus obtained and their relationships are then evaluated by mapping to two agent-oriented metamodels: TAO and Islander. We then refine the metamodel by extending the comparisons with the metamodels implicit or explicit within five more extant agent-oriented approaches: Adelfe, PASSI, Gaia, INGENIAS, and Tropos. The resultant FAML metamodel is a potential candidate for future standardization as an important component for engineering an agent modeling language. |
Votes: Davis, mohammed afaan
#442
-
Bhana 2022
Knowledge Graph Fusion for Language Model Fine-Tuning
9th International Conference on Soft Computing and Machine Intelligence (ISCMI) 2022;():167-172 Toronto, CANADA Ieee 2022 DOI: 10.1109/iscmi56532.2022.10068451 · Ref ID: 3063 Language Models such as BERT (Bidirectional Encoder Representations from Transformers) have grown in popularity due to their ability to be pre-trained and perform robustly on a wide range of Natural Language Processing tasks. Often seen as an evolution over traditional word embedding techniques, they can produce semantic representations of text, useful for tasks such as semantic similarity. However, state-of-the-art models often have high computational requirements and lack global context or domain knowledge, which is required for complete language understanding. To address these limitations, we investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT. An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language and extended to inject contextually relevant information into sentences. As a side effect, changes made to K-BERT for accommodating the English language also extend to other word-based languages. Experiments conducted indicate that injected knowledge introduces noise. We see statistically significant improvements for knowledge-driven tasks when this noise is minimised. We show evidence that, given the appropriate task, modest injection with relevant, high-quality knowledge is most performant. |
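K-BERT-style enrichment injects knowledge-graph triples next to the entities they describe, before the enriched sentence is fed to the model. A minimal sketch with an invented one-entry knowledge graph (real K-BERT also uses soft-position indices and a visibility matrix to contain the noise the abstract mentions):

```python
# Illustrative sketch of K-BERT-style sentence enrichment. The tiny KG
# below is invented for the example.
KG = {"BERT": [("BERT", "is_a", "language model")]}

def inject(sentence: str) -> str:
    """Append matching KG triples immediately after each entity mention."""
    tokens = []
    for word in sentence.split():
        tokens.append(word)
        for h, r, t in KG.get(word, []):
            tokens.append(f"[{h} {r} {t}]")
    return " ".join(tokens)
```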
Votes: Davis, Srividya
#1711
-
Bhargava 2024
Overcoming the Challenges of Large Language Models: Introducing a Novel Proposition for Synthetic Data Validation
2024 IEEE 7th International Conference on Big Data and Artificial Intelligence, BDAI 2024 2024;():290-295 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/BDAI62182.2024.10692968 · Ref ID: 4146 The market debut of ChatGPT gave rise to the development and deployment of various other Large Language Models (LLMs) that achieve state-of-the-art performance across various tasks. The growing popularity of these models has motivated some to attempt to construct or enhance their own LLM. We must be aware of the significant problems that already exist and that we might face along the way. This paper aims to identify and investigate the main challenges in this field, provide existing solutions, and propose novel approaches to mitigate them. A unique Truth-Table proposition for validating synthetic data is presented, examining two models, along with a bidirectional knowledge graph-based solution for curing the reverse curse problem, data generation strategies, domain adaptation methods, and the use of a custom dataset to address model hallucinations. The methodology and findings of this study provide valuable insights for users, researchers, and industry experts who are interested in LLMs. It serves as a reference for future research on current models, refining models or developing domain-specific ones. © 2024 IEEE. |
Votes: Davis, mohammed afaan
#367
-
Bhatia 2023
Inductive Reasoning in Minds and Machines
Induction-the ability to generalize from existing knowledge-is the cornerstone of intelligence. Cognitive models of human induction are largely limited to toy problems and cannot make quantitative predictions for the thousands of different induction arguments that have been studied by researchers, or to the countless induction arguments that could be encountered in everyday life. Leading large language models (LLMs) go beyond toy problems but fail to mimic observed patterns of human induction. In this article, we combine rich knowledge representations obtained from LLMs with theories of human inductive reasoning developed by cognitive psychologists. We show that this integrative approach can capture several benchmark empirical findings on human induction and generate human-like responses to natural language arguments with thousands of common categories and properties. These findings shed light on the cognitive mechanisms at play in human induction and show how existing theories in psychology and cognitive science can be integrated with new methods in artificial intelligence, to successfully model high-level human cognition. |
Votes: Davis, Srividya
#3863
-
Bhusal 2024
SECURE: Benchmarking Large Language Models for Cybersecurity Advisory
arXiv 2024;(): 2024 Ref ID: 8336 Large Language Models (LLMs) have demonstrated potential in cybersecurity applications, but confidence in them remains low due to problems like hallucinations and a lack of truthfulness. Existing benchmarks provide general evaluations but do not sufficiently address the practical and applied aspects of LLM performance in cybersecurity-specific tasks. To address this gap, we introduce SECURE (Security Extraction, Understanding & Reasoning Evaluation), a benchmark designed to assess LLM performance in realistic cybersecurity scenarios. SECURE includes six datasets focussed on the Industrial Control System sector to evaluate knowledge extraction, understanding, and reasoning based on industry-standard sources. Our study evaluates seven state-of-the-art models on these tasks, providing insights into their strengths and weaknesses in cybersecurity contexts, and offers recommendations for improving LLM reliability as cyber advisory tools. |
Votes: yuexi, mohammed afaan
#111
-
Bi 2024
CodeKGC: Code Language Model for Generative Knowledge Graph Construction
ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2024;23(3):16 2024 DOI: 10.1145/3641850 · Ref ID: 2936 Current generative knowledge graph construction approaches usually fail to capture structural knowledge by simply flattening natural language into serialized texts or a specification language. However, large generative language models trained on structured data such as code have demonstrated impressive capabilities in understanding natural language for structural prediction and reasoning tasks. Intuitively, we address the task of generative knowledge graph construction with a code language model: given a code-format natural language input, the target is to generate triples, which can be represented as code completion tasks. Specifically, we develop schema-aware prompts that effectively utilize the semantic structure within the knowledge graph. As code inherently possesses structure, such as class and function definitions, it serves as a useful model for prior semantic structural knowledge. Furthermore, we employ a rationale-enhanced generation method to boost the performance. Rationales provide intermediate steps, thereby improving knowledge extraction abilities. Experimental results indicate that the proposed approach can obtain better performance on benchmark datasets compared with baselines. |
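The code-format prompting idea can be sketched as follows. This is a hypothetical rendering, not the authors' code: the schema, class names, and prompt wording are invented. The point is that the schema is presented to the model as class definitions, so triple extraction becomes code completion:

```python
# Hypothetical schema-aware, code-format prompt for generative KG
# construction. Schema and wording are invented for illustration.
SCHEMA = {"Person": ["works_for"], "Company": []}

def code_prompt(text: str) -> str:
    """Render the schema as code, then ask for triples as a completion."""
    lines = ["# Knowledge-graph schema as code"]
    for cls, relations in SCHEMA.items():
        lines.append(f"class {cls}(Entity): pass")
        lines += [f"#   relation: {r}" for r in relations]
    lines.append(f'text = "{text}"')
    lines.append("# Complete: triples = [ (head, relation, tail), ... ]")
    return "\n".join(lines)
```

A code-pretrained model would then be asked to continue this prompt with a `triples = [...]` assignment.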
Voted: Davis, Srividya · Final decision: (blank)
#2676
-
BinoPatricPrakash 2014
Mining semantic representation from medical text: A Bayesian approach
2014 International Conference on Recent Trends in Information Technology 2014;():1-4 2014 DOI: 10.1109/ICRTIT.2014.6996197 · Ref ID: 6237 Machine learning is a subfield of artificial intelligence that deals with the exploration and construction of systems that can learn from data. Machine learning trains the computers to manage the critical situations via examining, self-training, inference by observation and previous experience. This paper provides an overview of the development of an efficient classifier that represents the semantics in medical data (Medline) using a Machine Learning (ML) perspective. In recent days people are more concerned about their health and explore ways to identify health related information. But the process of identifying the semantic representation for the medical terms is a difficult task. The main goal of our work was to identify the semantic representation for the medical abstracts in the Medline repository using Machine Learning and Natural Language Processing (NLP). |
Voted: Davis, mohammed afaan · Final decision: (blank)
#233
-
Biswas 2022
Entity Type Prediction Leveraging Graph Walks and Entity Descriptions
21st International Semantic Web Conference (ISWC) 2022;13489():392-410 Electr Network Springer International Publishing Ag 2022 DOI: 10.1007/978-3-031-19433-7_23 · Ref ID: 3398 The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation or human curation. Entity typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper presents GRAND, a novel approach for entity typing leveraging different graph walk strategies in RDF2vec together with textual entity descriptions. RDF2vec first generates graph walks and then uses a language model to obtain embeddings for each node in the graph. This study shows that the walk generation strategy and the embedding model have a significant effect on the performance of the entity typing task. The proposed approach outperforms the baseline approaches on the benchmark datasets DBpedia and FIGER for entity typing in KGs for both fine-grained and coarse-grained classes. The results show that the combination of orderaware RDF2vec variants together with the contextual embeddings of the textual entity descriptions achieve the best results. |
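The walk-generation step that RDF2vec feeds into its language model can be sketched with a toy graph. This is an assumed minimal illustration (uniform random walks only; the paper's order-aware variants and embedding step are omitted), and the `dbr:`/`dbo:` names are placeholders.

```python
import random

# Sketch: random walks over an RDF graph produce "sentences" of alternating
# entities and predicates, which an embedding model then treats as text.
def generate_walks(graph, start, num_walks, depth, rng):
    """graph maps an entity to a list of (predicate, object) edges."""
    walks = []
    for _ in range(num_walks):
        walk, node = [start], start
        for _ in range(depth):
            edges = graph.get(node)
            if not edges:  # dead end: stop this walk early
                break
            pred, node = rng.choice(edges)
            walk += [pred, node]
        walks.append(walk)
    return walks

graph = {
    "dbr:Berlin": [("dbo:capitalOf", "dbr:Germany"), ("dbo:locatedIn", "dbr:Europe")],
    "dbr:Germany": [("dbo:partOf", "dbr:Europe")],
}
for w in generate_walks(graph, "dbr:Berlin", num_walks=3, depth=2, rng=random.Random(0)):
    print(" ".join(w))
```

Each printed walk starts at the seed entity and alternates entity, predicate, entity, which is exactly the token sequence the downstream embedding model consumes.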
Voted: Srividya, Davis · Final decision: (blank)
#1135
-
Biswas 2021
Contextual language models for knowledge graph completion
CEUR Workshop Proceedings 2021;2997(): CEUR-WS 2021 Ref ID: 5604 Knowledge Graphs (KGs) have become the backbone of various machine learning based applications over the past decade. However, the KGs are often incomplete and inconsistent. Several representation learning based approaches have been introduced to complete the missing information in KGs. Besides, Neural Language Models (NLMs) have gained huge momentum in NLP applications. However, exploiting the contextual NLMs to tackle the Knowledge Graph Completion (KGC) task is still an open research problem. In this paper, a GPT-2 based KGC model is proposed and is evaluated on two benchmark datasets. The initial results obtained from the fine-tuning of the GPT-2 model for triple classification strengthens the importance of usage of NLMs for KGC. Also, the impact of contextual language models for KGC has been discussed. © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
Voted: Davis, Srividya · Final decision: (blank)
#89
-
Biswas 2021
Cat2Type: Wikipedia Category Embeddings for Entity Typing in Knowledge Graphs
11th Knowledge Capture Conference (K-CAP) 2021;():81-88 Electr Network Assoc Computing Machinery 2021 DOI: 10.1145/3460210.3493575 · Ref ID: 3239 The entity type information in Knowledge Graphs (KGs) such as DBpedia, Freebase, etc. is often incomplete due to automated generation. Entity Typing is the task of assigning or inferring the semantic type of an entity in a KG. This paper introduces an approach named Cat2Type which exploits Wikipedia Categories to predict the missing entity types in a KG. This work extracts information from Wikipedia Category names and the Wikipedia Category graph, which are sources of rich semantic information about the entities. In Cat2Type, the characteristic features of the entities encapsulated in Wikipedia Category names are exploited using Neural Language Models. On the other hand, a Wikipedia Category graph is constructed to capture the connections between the categories. Node-level representations are learned by optimizing the neighbourhood information on the Wikipedia Category graph. These representations are then used for entity type prediction via classification. The performance of Cat2Type is assessed on two real-world benchmark datasets, DBpedia630k and FIGER. The experiments show that Cat2Type obtains a significant improvement over state-of-the-art approaches. |
Voted: Davis, mohammed afaan · Final decision: (blank)
#2151
-
Bleidt 2024
ArtQuest: Countering Hidden Language Biases in ArtVQA
2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024;():7311-7320 2024 DOI: 10.1109/WACV57701.2024.00716 · Ref ID: 6992 The task of Visual Question Answering (VQA) has been studied extensively on general-domain real-world images. Transferring insights from general domain VQA to the art domain (ArtVQA) is non-trivial, as the latter requires models to identify abstract concepts, details of brushstrokes and styles of paintings in the visual data as well as possess background knowledge about art. This is exacerbated by the lack of high-quality datasets. In this work, we shed light on hidden linguistic biases in the AQUA dataset, which is the only publicly available benchmark dataset for ArtVQA. As a result, the majority of questions can be answered without consulting the visual information, making the “V” in ArtVQA rather insignificant. In order to counter this problem, we create a simple, yet practical dataset, ArtQuest, using structured information from the SemArt collection. Our dataset and the pipeline to reproduce our results are publicly available at https://github.com/bletib/artquest. |
Voted: Srividya, Davis · Final decision: (blank)
#3491
-
Boer 2024
Harnessing the Power of Semi-Structured Knowledge and LLMs with Triplet-Based Prefiltering for Question Answering
arXiv 2024;(): 2024 Ref ID: 8575 Large Language Models (LLMs) frequently lack domain-specific knowledge and even fine-tuned models tend to hallucinate. Hence, more reliable models that can include external knowledge are needed. We present a pipeline, 4StepFocus, and specifically a preprocessing step, that can substantially improve the answers of LLMs. This is achieved by providing guided access to external knowledge making use of the model's ability to capture relational context and conduct rudimentary reasoning by themselves. The method narrows down potentially correct answers by triplets-based searches in a semi-structured knowledge base in a direct, traceable fashion, before switching to latent representations for ranking those candidates based on unstructured data. This distinguishes it from related methods that are purely based on latent representations. 4StepFocus consists of the steps: 1) Triplet generation for extraction of relational data by an LLM, 2) substitution of variables in those triplets to narrow down answer candidates employing a knowledge graph, 3) sorting remaining candidates with a vector similarity search involving associated non-structured data, 4) reranking the best candidates by the LLM with background data provided. Experiments on a medical, a product recommendation, and an academic paper search test set demonstrate that this approach is indeed a powerful augmentation. It not only adds relevant traceable background information from information retrieval, but also improves performance considerably in comparison to state-of-the-art methods. This paper presents a novel, largely unexplored direction and therefore provides a wide range of future work opportunities. Used source code is available at https://github.com/kramerlab/4StepFocus. |
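The four numbered steps in this abstract can be sketched end to end. This is a hypothetical reconstruction, not the authors' code: step 1 (LLM triplet generation) and step 4 (LLM reranking) are stubbed, and step 3 uses token overlap in place of a vector similarity search so the sketch stays self-contained. The toy KG and all function names are illustrative.

```python
# Toy semi-structured knowledge base of (head, relation, tail) triplets.
KG = {
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "headache"),
    ("aspirin", "treats", "fever"),
}

def step1_triplet(question):
    # Stub: a real system asks an LLM to turn the question into a
    # (head, relation, tail) triplet with "?x" marking the unknown.
    return ("?x", "treats", "headache")

def step2_candidates(triplet, kg):
    # Substitute the variable against the knowledge graph to narrow candidates.
    h, r, t = triplet
    if h == "?x":
        return sorted(s for (s, p, o) in kg if p == r and o == t)
    return sorted(o for (s, p, o) in kg if p == r and s == h)

def step3_rank(question, candidates, docs):
    # Rank candidates by overlap between the question and their attached
    # unstructured text (a crude stand-in for embedding similarity).
    q = set(question.lower().split())
    return sorted(candidates,
                  key=lambda c: -len(q & set(docs.get(c, "").lower().split())))

docs = {"aspirin": "a common pain reliever sold over the counter",
        "ibuprofen": "an anti inflammatory drug"}
candidates = step2_candidates(step1_triplet("Which pain reliever treats headache?"), KG)
print(step3_rank("Which pain reliever treats headache?", candidates, docs))
# → ['aspirin', 'ibuprofen']; step 4 would hand this shortlist to an LLM to rerank
```

The point of the design is traceability: steps 1–2 operate on explicit triplets that can be inspected, and only the final ranking falls back to latent representations.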
Voted: Davis, brandon · Final decision: (blank)
#799
-
Bombieri 2024
Surgicberta: a pre-trained language model for procedural surgical language
Pre-trained language models are now ubiquitous in natural language processing, being successfully applied for many different tasks and in several real-world applications. However, even though there is a wealth of high-quality written materials on surgery, and the scientific community has shown a growing interest in the application of natural language processing techniques in surgery, a pre-trained language model specific to the surgical domain is still missing. The creation and public release of such a model would serve numerous useful clinical applications. For example, it could enhance existing surgical knowledge bases employed for task automation, or assist medical students in summarizing complex surgical descriptions. For this reason, in this paper, we introduce SurgicBERTa, a pre-trained language model specific for the English surgical language, i.e., the language used in the surgical domain. SurgicBERTa has been obtained from RoBERTa through continued pre-training with the Masked language modeling objective on 300 k sentences taken from English surgical books and papers, for a total of 7 million words. By publicly releasing SurgicBERTa, we make available a resource built from the content collected in many high-quality surgical books, online textual resources, and academic papers. We performed several assessments in order to evaluate SurgicBERTa, comparing it with the general domain RoBERTa. First, we intrinsically assessed the model in terms of perplexity, accuracy, and evaluation loss resulting from the continual training according to the masked language modeling task. Then, we extrinsically evaluated SurgicBERTa on several downstream tasks, namely (i) procedural sentence detection, (ii) procedural knowledge extraction, (iii) ontological information discovery, and (iv) surgical terminology acquisition. 
Finally, we conducted some qualitative analysis on SurgicBERTa, showing that it contains a lot of surgical knowledge that could be useful to enrich existing state-of-the-art surgical knowledge bases or to extract surgical knowledge. All the assessments show that SurgicBERTa better deals with surgical language than a general-purpose pre-trained language model such as RoBERTa, and therefore can be effectively exploited in many computer-assisted applications in the surgical domain. |
Voted: Davis, Srividya · Final decision: (blank)
#2978
-
Bork 2018
Systematic analysis and evaluation of visual conceptual modeling language notations
2018 12th International Conference on Research Challenges in Information Science (RCIS) 2018;():1-11 2018 DOI: 10.1109/RCIS.2018.8406652 · Ref ID: 6377 In systems analysis and design it is common to refer to some widely used de-facto industry standards like Unified Modeling Language (UML) and Business Process Model and Notation (BPMN). Albeit the wide adoption of such standard modeling languages, only limited research focuses on the techniques in which these standards are specified and the quality they provide. Most research focuses on case studies of applying standards, ways of extending standards to domain-specific requirements, e.g., by means of profiling, or evaluations of single modeling languages, e.g., using questionnaires or semiotic theories. By contrast, this paper critically reflects on the current state of modeling standards with a focus on their graphical representation (notation). The contribution of this paper is threefold: First, a systematic analysis is performed thereby investigating how different modeling standards specify notational aspects. Second, an evaluation is performed by applying Moody's Physics of Notation theory to the identified standards. Third, based on the findings, recommendations are given to improve modeling standard specifications in the future w.r.t. their notational aspects. |
Voted: mohammed afaan, yuexi · Final decision: (blank)
#1639
-
Boscariol 2024
A METHODOLOGICAL APPROACH TO ASSET INFORMATION MANAGEMENT VIA KNOWLEDGE GRAPHS AND LARGE LANGUAGE MODELS
Proceedings of the European Conference on Computing in Construction 2024;2024():404-411 European Council on Computing in Construction (EC3) 2024 DOI: 10.35490/EC3.2024.286 · Ref ID: 4361 Tackling the need of large organizations for a proactive Asset Information Management (AIM) System, a methodological approach to knowledge management applied to built assets portfolios is proposed. It aims at synergically leveraging Knowledge Graphs (KGs) and Artificial Intelligence (AI) technologies to enable analytics on input data. In the theorized pipeline Large Language Models (LLMs) are meant to be used both in the graph creation phase, extracting data from unstructured sources and organizing them according to domain ontologies, as tested on a use-case sample, and in the knowledge extraction phase via queries. © 2024 European Council on Computing in Construction. |
Voted: Srividya, Ishan · Final decision: (blank)
#122
-
Bosselut 2019
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
57th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2019;():4762-4779 Florence, ITALY Assoc Computational Linguistics-Acl 2019 Ref ID: 3052 We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods. |
Voted: yuexi, mohammed afaan · Final decision: (blank)
#1735
-
Boudin 2010
Positional language models for clinical information retrieval
EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference 2010;():108-115 2010 Ref ID: 5815 The PECO framework is a knowledge representation for formulating clinical questions. Queries are decomposed into four aspects, which are Patient-Problem (P), Exposure (E), Comparison (C) and Outcome (O). However, no test collection is available to evaluate such framework in information retrieval. In this work, we first present the construction of a large test collection extracted from systematic literature reviews. We then describe an analysis of the distribution of PECO elements throughout the relevant documents and propose a language modeling approach that uses these distributions as a weighting strategy. In our experiments carried out on a collection of 1.5 million documents and 423 queries, our method was found to lead to an improvement of 28% in MAP and 50% in P@5, as compared to the state-of-the-art method. © 2010 Association for Computational Linguistics. |
Voted: mohammed afaan, Ishan · Final decision: (blank)
#521
-
Bouzid 2024
Leveraging Generative AI in Short Document Indexing
The efficiency of information retrieval systems primarily depends on the effective representation of documents during query processing. This representation is mainly constructed from relevant document terms identified and selected during their indexing, which are then used for retrieval. However, when documents contain only a few features, such as in short documents, the resulting representation may be information-poor due to a lack of index terms and their lack of relevance. Although document representation can be enriched using techniques like word embeddings, these techniques require large pre-trained datasets, which are often unavailable in the context of domain-specific short documents. This study investigates a new approach to enrich document representation during indexing using generative AI. In the proposed approach, relevant terms extracted from documents and preprocessed for indexing are enriched with a list of key terms suggested by a large language model (LLM). After conducting a small benchmark of several renowned LLM models for key term suggestions from a set of short texts, the GPT-4o model was chosen to experiment with the proposed indexing approach. The findings of this study yielded notable results, demonstrating that generative AI can efficiently fill the knowledge gap in document representation, regardless of the retrieval technique used. |
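The enrichment step this abstract describes can be sketched with a stubbed LLM call. This is an assumed illustration, not the study's implementation: `suggest_terms` stands in for a GPT-4o key-term suggestion request, and the cap of five added terms is an arbitrary choice for the sketch.

```python
def suggest_terms(text):
    # Stub standing in for an LLM key-term suggestion call on a short document.
    return ["myocardial infarction", "cardiology", "chest pain"]

def enrich_index_terms(doc_text, extracted_terms, llm=suggest_terms, max_extra=5):
    """Append novel LLM-suggested terms to a short document's sparse index terms."""
    seen = {t.lower() for t in extracted_terms}
    # Keep only suggestions that add information beyond the extracted terms.
    extra = [t for t in llm(doc_text) if t.lower() not in seen]
    return list(extracted_terms) + extra[:max_extra]

terms = enrich_index_terms("Patient admitted with chest pain.",
                           ["chest pain", "patient"])
print(terms)  # → ['chest pain', 'patient', 'myocardial infarction', 'cardiology']
```

Deduplicating against the extracted terms matters: the goal is to fill the representation gap of short documents, not to double-count terms already in the index.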
Voted: mohammed afaan, yuexi · Final decision: (blank)
#3960
-
Bronzini 2024
Unveiling LLMs: The Evolution of Latent Representations in a Dynamic Knowledge Graph
arXiv 2024;(): 2024 Ref ID: 8218 Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of factual knowledge. However, understanding their underlying reasoning and internal mechanisms in exploiting this knowledge remains a key research area. This work unveils the factual information an LLM represents internally for sentence-level claim verification. We propose an end-to-end framework to decode factual knowledge embedded in token representations from a vector space to a set of ground predicates, showing its layer-wise evolution using a dynamic knowledge graph. Our framework employs activation patching, a vector-level technique that alters a token representation during inference, to extract encoded knowledge. Accordingly, we neither rely on training nor external models. Using factual and common-sense claims from two claim verification datasets, we showcase interpretability analyses at local and global levels. The local analysis highlights entity centrality in LLM reasoning, from claim-related information and multi-hop reasoning to representation errors causing erroneous evaluation. On the other hand, the global analysis reveals trends in the underlying evolution, such as word-based knowledge evolving into claim-related facts. By interpreting semantics from LLM latent representations and enabling graph-related analyses, this work enhances the understanding of the factual knowledge resolution process. |
Voted: yuexi, Srividya · Final decision: (blank)
#553
-
Buehler 2024
MechGPT, a Language-Based Strategy for Mechanics and Materials Modeling That Connects Knowledge Across Scales, Disciplines, and Modalities
For centuries, researchers have sought out ways to connect disparate areas of knowledge. While early scholars (Galileo, da Vinci, etc.) were experts across fields, specialization took hold later. With the advent of Artificial Intelligence, we can now explore relationships across areas (e.g., mechanics-biology) or disparate domains (e.g., failure mechanics-art). To achieve this, we use a fine-tuned large language model (LLM), here for a subset of knowledge in multiscale materials failure. The approach includes the use of a general-purpose LLM to distill question-answer pairs from raw sources followed by LLM fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas. While the model has some ability to recall knowledge from training, we find that LLMs are particularly useful for extracting structural insights through Ontological Knowledge Graphs. These interpretable graph structures provide explanatory insights, frameworks for new research questions, and visual representations of knowledge that also can be used in retrieval-augmented generation. Three versions of MechGPT are discussed, featuring different sizes from 13 x 109 to 70 x 109 parameters, and reaching context lengths of more than 10,000 tokens. This provides ample capacity for sophisticated retrieval augmented strategies, as well as agent-based modeling where multiple LLMs interact collaboratively and/or adversarially, the incorporation of new data from the literature or web searches, as well as multimodality. |
Voted: Xinchen, mohammed afaan · Final decision: (blank)
#310
-
Buehler 2024
Generative Retrieval-Augmented Ontologic Graph and Multiagent Strategies for Interpretive Large Language Model-Based Materials Design
Transformer neural networks show promising capabilities, in particular for uses in materials analysis, design, and manufacturing, including their capacity to work effectively with human language, symbols, code, and numerical data. Here, we explore the use of large language models (LLMs) as a tool that can support engineering analysis of materials, applied to retrieving key information about subject areas, developing research hypotheses, discovery of mechanistic relationships across disparate areas of knowledge, and writing and executing simulation codes for active knowledge generation based on physical ground truths. Moreover, when used as sets of AI agents with specific features, capabilities, and instructions, LLMs can provide powerful problem-solution strategies for applications in analysis and design problems. Our experiments focus on using a fine-tuned model, MechGPT, developed based on training data in the mechanics of materials domain. We first affirm how fine-tuning endows LLMs with a reasonable understanding of subject area knowledge. However, when queried outside the context of learned matter, LLMs can have difficulty recalling correct information and may hallucinate. We show how this can be addressed using retrieval-augmented Ontological Knowledge Graph strategies. The graph-based strategy helps us not only to discern how the model understands what concepts are important but also how they are related, which significantly improves generative performance and also naturally allows for injection of new and augmented data sources into generative AI algorithms. We find that the additional feature of relatedness provides advantages over regular retrieval augmentation approaches and not only improves LLM performance but also provides mechanistic insights for exploration of a material design process. 
Illustrated for a use case of relating distinct areas of knowledge, here, music and proteins, such strategies can also provide an interpretable graph structure with rich information at the node, edge, and subgraph level that provides specific insights into mechanisms and relationships. We discuss other approaches to improve generative qualities, including nonlinear sampling strategies and agent-based modeling that offer enhancements over single-shot generations, whereby LLMs are used to both generate content and assess content against an objective target. Examples provided include complex question answering, code generation, and execution in the context of automated force-field development from actively learned density functional theory (DFT) modeling and data analysis. |
Voted: mohammed afaan, Ishan · Final decision: (blank)
#3777
-
Buehler 2024
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
arXiv 2024;(): 2024 Ref ID: 8716 PRefLexOR (Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning) combines preference optimization with concepts from Reinforcement Learning to enable models to self-teach through iterative reasoning improvements. We propose a recursive learning approach that engages the model in multi-step reasoning, revisiting, and refining intermediate steps before producing a final output in training and inference phases. Through multiple training stages, the model first learns to align its reasoning with accurate decision paths by optimizing the log odds between preferred and non-preferred responses. During this process, PRefLexOR builds a dynamic knowledge graph by generating questions from random text chunks and retrieval-augmentation to contextualize relevant details from the entire training corpus. In the second stage, preference optimization enhances model performance by using rejection sampling to fine-tune reasoning quality by continually producing in-situ training data while masking the reasoning steps. Recursive optimization within a thinking token framework introduces iterative feedback loops, where the model refines reasoning, achieving deeper coherence, consistency, and adaptability. Implemented in small language models with only 3 billion parameters, we show that even tiny models can iteratively teach themselves to reason with greater depth and reflectivity. Our implementation is straightforward and can be incorporated into any existing pretrained LLM. We focus our examples on applications in biological materials science and demonstrate the method in a variety of case studies that range from in-domain to cross-domain applications. Using reasoning strategies that include thinking and reflection modalities we build a multi-agent recursive self-improving inference approach to successively improve responses via repeated sampling in inference time. |
Voted: Mike, Srividya · Final decision: (blank)
#150
-
Bui 2024
Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT
1st Workshop on AI-powered Question Answering Systems for Multimedia (AIQAM) 2024;():36-41 Phuket, THAILAND Assoc Computing Machinery 2024 DOI: 10.1145/3643479.3662055 · Ref ID: 3013 In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic. LLMs find applications in various fields and contribute significantly. Despite their powerful language capabilities, similar to pre-trained language models (PLMs), LLMs still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, researchers have proposed Retrieval-Augmented Generation (RAG) techniques, some others have proposed the integration of LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate feedback to user queries. Education plays a crucial role in human development and progress. With the technology transformation, traditional education is being replaced by digital or blended education. Therefore, educational data in the digital environment is increasing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of KG in conjunction with LLMs for question-answering tasks. |
Voted: mohammed afaan, yuexi · Final decision: (blank)
#2534
-
Buluç 2013
High-Productivity and High-Performance Analysis of Filtered Semantic Graphs
2013 IEEE 27th International Symposium on Parallel and Distributed Processing 2013;():237-248 2013 DOI: 10.1109/IPDPS.2013.52 · Ref ID: 6337 High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry attributes of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must view the graph through a filter that passes only those individual vertices and edges of interest. Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, is customizable in two ways. First, the user can write custom graph algorithms by specifying operations between edges and vertices. These programmer-specified operations are called semiring operations due to KDT's underlying linear-algebraic abstractions. Second, the user can customize existing graph algorithms by writing filters that return true for those vertices and edges the user wants to retain during algorithm execution. For high productivity, both semiring operations and filters are written in a high-level language, resulting in relatively low performance due to the bottleneck of having to call into the Python virtual machine for each vertex and edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate semiring operations and filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance Combinatorial BLAS engine, and show our approach enables users to write in high-level languages and still obtain the high performance of low-level code. We also present a new roofline model for graph traversals, and show that our high-performance implementations do not significantly deviate from the roofline. 
Overall, we demonstrate the first known solution to the problem of obtaining high performance from a productivity language when applying graph algorithms selectively on semantic graphs. |
Voted: mohammed afaan, yuexi · Final decision: (blank)
#2579
-
Bunte 2016
Integrating semantics for diagnosis of manufacturing systems
2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA) 2016;():1-8 2016 DOI: 10.1109/ETFA.2016.7733721 · Ref ID: 6859 Trends in novel manufacturing systems lead to an increased level of data availability and smart usage of these data. Nowadays, many approaches are available to use the data, but because of an increased flexibility of the systems the interaction between machines and humans has become a challenge. Humans have to browse through a huge amount of data, need knowledge about the machine and underlying algorithms to interpret the results; they cannot use their known terms for communication, we call it the conceptual gap. The user should be enabled to communicate with the machine on a more abstract level and in a more natural way. Therefore, a natural language layer is introduced to provide users with a familiar interaction interface. Underlying layers contain knowledge about the domain, the machines and how data can be accessed and processed. This enables users' questions such as “Are there any anomalies in the system?” to be answered. Answers are provided in natural language and evaluated with a test set of 204 questions. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1589
-
Buongiorno 2024
Leveraging Gaming to Enhance Knowledge Graphs for Explainable Generative AI Applications
IEEE Conference on Computational Intelligence and Games, CIG 2024;(): IEEE Computer Society 2024 DOI: 10.1109/CoG60054.2024.10645673 · Ref ID: 4402 External knowledge graphs (KGs) can be used to augment large language models (LLMs), while simultaneously providing an explainable knowledge base of facts that can be inspected by a human. This approach may be particularly valuable in domains where explainability is critical, like human trafficking data analysis. However, creating KGs can pose challenges. KGs parsed from documents may comprise explicit connections (those directly stated by a document) but miss implicit connections (those obvious to a human although not directly stated). To address these challenges, this preliminary research introduces the GAME-KG framework, standing for 'Gaming for Augmenting Metadata and Enhancing Knowledge Graphs.' GAME-KG is a federated approach to modifying explicit as well as implicit connections in KGs by using crowdsourced feedback collected through video games. GAME-KG is shown through two demonstrations: a Unity test scenario from Dark Shadows, a video game that collects feedback on KGs parsed from US Department of Justice (DOJ) Press Releases on human trafficking, and a following experiment where OpenAI's GPT-4 is prompted to answer questions based on a modified and unmodified KG. Initial results suggest that GAME-KG can be an effective framework for enhancing KGs while simultaneously providing an explainable set of structured facts verified by humans. © 2024 IEEE. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#2504
-
Buyko 2011
Generating Semantics for the Life Sciences via Text Analytics
2011 IEEE Fifth International Conference on Semantic Computing 2011;():193-196 2011 DOI: 10.1109/ICSC.2011.75 · Ref ID: 6227 The life sciences have a strong need for carefully curated, semantically rich fact repositories. Knowledge harvesting from unstructured textual sources is currently performed by highly skilled curators who manually feed semantics into such databases as a result of deep understanding of the documents chosen to populate such repositories. As this is a slow and costly process, we here advocate an automatic approach to the generation of database contents which is based on JREX, a high performance relation extraction system. As a real-life example, we target REGULONDB, the world's largest manually curated reference database for the transcriptional regulation network of E. coli. We investigate in our study the performance of automatic knowledge capture from various literature sources, such as PUBMED abstracts and associated full text articles. Our results show that we can, indeed, automatically re-create a considerable portion of the REGULONDB database by processing the relevant literature sources. Hence, this approach might help curators widen the knowledge acquisition bottleneck in this field. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1008
-
Buzzega 2023
Automated Knowledge Graph Completion for Natural Language Understanding: Known Paths and Future Directions
CEUR Workshop Proceedings 2023;3478():160-172 CEUR-WS 2023 Ref ID: 5248 Knowledge Graphs (KGs) are large collections of structured data that can model real world knowledge and are important assets for the companies that employ them. KGs are usually constructed iteratively and often show a sparse structure. Also, as knowledge evolves, KGs must be updated and completed. Many automatic methods for KG Completion (KGC) have been proposed in the literature to reduce the costs associated with manual maintenance. Motivated by an industrial case study aiming to enrich a KG specifically designed for Natural Language Understanding tasks, this paper presents an overview of classical and modern deep learning completion methods. In particular, we delve into Large Language Models (LLMs), which are the most promising deep learning architectures. We show that their applications to KGC are affected by several shortcomings, namely they neglect the structure of KG and treat KGC as a classification problem. Such limitations, together with the brittleness of the LLMs themselves, stress the need to create KGC solutions at the interface between symbolic and neural approaches and lead to the way ahead for future research in intelligible corpus-based KGC. © 2023 CEUR-WS. All rights reserved. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#129
-
Cadeddu 2024
A comparative analysis of knowledge injection strategies for large language models in the scholarly domain
In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers. |
yuexi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1106
-
Cadeddu 2024
A comparative analysis of knowledge injection strategies for large language models in the scholarly domain
In recent years, transformer-based models have emerged as powerful tools for natural language processing tasks, demonstrating remarkable performance in several domains. However, they still present significant limitations. These shortcomings become more noticeable when dealing with highly specific and complex concepts, particularly within the scientific domain. For example, transformer models have particular difficulties when processing scientific articles due to the domain-specific terminologies and sophisticated ideas often encountered in scientific literature. To overcome these challenges and further enhance the effectiveness of transformers in specific fields, researchers have turned their attention to the concept of knowledge injection. Knowledge injection is the process of incorporating outside knowledge into transformer models to improve their performance on certain tasks. In this paper, we present a comprehensive study of knowledge injection strategies for transformers within the scientific domain. Specifically, we provide a detailed overview and comparative assessment of four primary methodologies, evaluating their efficacy in the task of classifying scientific articles. For this purpose, we constructed a new benchmark including both 24K labelled papers and a knowledge graph of 9.2K triples describing pertinent research topics. We also developed a full codebase to easily re-implement all knowledge injection strategies in different domains. A formal evaluation indicates that the majority of the proposed knowledge injection methodologies significantly outperform the baseline established by Bidirectional Encoder Representations from Transformers. © 2024 |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#196
-
Cai 2024
Editing Knowledge Representation of Language Model via Rephrased Prefix Prompts
20th International Conference on Intelligent Computing (ICIC) 2024;14878():459-470 Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024 DOI: 10.1007/978-981-97-5672-8_39 · Ref ID: 3658 Neural language models (LMs) have been extensively trained on vast corpora to store factual knowledge about various aspects of the world described in texts. Current technologies typically employ knowledge editing methods or specific prompts to modify LM outputs. However, existing knowledge editing methods are costly and inefficient, struggling to produce appropriate text. Additionally, prompt engineering is opaque and requires significant effort to find suitable prompts. To address these issues, we introduce a new method called PSPEM (Prefix Soft-Prompt Editing Method), that can be used for a lifetime with just one training. It resolves the inefficiencies and generalizability issues in knowledge editing methods and overcomes the opacity of prompt engineering by automatically seeking optimal soft prompts. Specifically, PSPEM adopts a prompt encoder and an encoding converter to compress and refine key information in prompts and adopts prompt alignment techniques to guide model generation, ensuring text consistency and adherence to the intended structure and content. We have validated the effectiveness of PSPEM through knowledge editing and attribute inserting. On the COUNTERFACT dataset, PSPEM achieved nearly 100% editing accuracy and demonstrated the highest level of fluency. We further analyzed the similarities between PSPEM and original prompts and their impact on the model's internals. The results indicate that PSPEM can serve as an alternative to original prompts, supporting the model in effective editing. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2037
-
Calixto 2021
Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks
NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference 2021;():3651-3661 Association for Computational Linguistics (ACL) 2021 Ref ID: 5700 Masked language models have quickly become the de facto standard when processing text. Recently, several approaches have been proposed to further enrich word representations with external knowledge sources such as knowledge graphs. However, these models are devised and evaluated in a monolingual setting only. In this work, we propose a language-independent entity prediction task as an intermediate training procedure to ground word representations on entity semantics and bridge the gap across different languages by means of a shared vocabulary of entities. We show that our approach effectively injects new lexical-semantic knowledge into neural models, improving their performance on different semantic tasks in the zero-shot crosslingual setting. As an additional advantage, our intermediate training does not require any supplementary input, allowing our models to be applied to new datasets right away. In our experiments, we use Wikipedia articles in up to 100 languages and already observe consistent gains compared to strong baselines when predicting entities using only the English Wikipedia. Further adding extra languages lead to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on ever increasing amounts of Wikipedia languages. © 2021 Association for Computational Linguistics. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3062
-
Calvagna 2023
Using Knowledge Awareness to Improve Safety of Autonomous Driving
2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2023;():2997-3002 2023 DOI: 10.1109/SMC53992.2023.10394593 · Ref ID: 6916 We present a method, which incorporates knowledge awareness into the symbolic computation of discrete controllers for reactive cyber physical systems, to improve decision making about the unknown operating environment under uncertain/incomplete inputs. Assuming an abstract model of the system and the environment, we translate the knowledge awareness of the operating context into linear temporal logic formulas and incorporate them into the system specifications to synthesize a controller. The knowledge base is built upon an ontology model of the environment objects and behavioural rules, which includes also symbolic models of partial input features. The resulting symbolic controller support smoother, early reactions, which improves the security of the system over existing approaches based on incremental symbolic perception. A motion planning case study for an autonomous vehicle has been implemented to validate the approach, and presented results show significant improvements with respect to safety of state-of-the-art symbolic controllers for reactive systems. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1821
-
Cao 2024
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():14016-14036 European Language Resources Association (ELRA) 2024 Ref ID: 4655 Memory is one of the most essential cognitive functions serving as a repository of world knowledge and episodes of activities. In recent years, large-scale pre-trained language models have shown remarkable memorizing ability. On the contrary, vanilla neural networks without pre-training have been long observed suffering from the catastrophic forgetting problem. To investigate such a retentive-forgetful contradiction and understand the memorizing dynamic mechanism of language models, we conduct thorough experiments by controlling the target knowledge types, the learning strategies and the learning schedules. We find that: 1) Vanilla language models without pre-training are forgetful; 2) Pre-training leads to retentive language models; 3) Knowledge relevance and diversification significantly influence the memory formation. These conclusions are useful for understanding the abilities of pre-trained language models and shed light on designing and evaluating new learning and inference algorithms of language models. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#477
-
Cao 2021
Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases
Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():1860-1874 Electr Network Assoc Computational Linguistics-Acl 2021 Ref ID: 3621 Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying predicting mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that previous decent performance mainly owes to the biased prompts which overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improves knowledge prediction mainly due to entity type guidance and golden answer leakage. Our findings shed light on the underlying predicting mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases. |
Xinchen
voted
Davis
voted
Final decision
What was the agreed final decision?
#3212
-
Cao 2024
AutoRD: An Automatic and End-to-End System for Rare Disease Knowledge Graph Construction Based on Ontologies-enhanced Large Language Models
arXiv 2024;(): 2024 Ref ID: 8151 Rare diseases affect millions worldwide but often face limited research focus due to their low prevalence. This results in prolonged diagnoses and a lack of approved therapies. Recent advancements in Large Language Models (LLMs) have shown promise in automating the extraction of medical information, offering potential to improve medical diagnosis and management. However, most LLMs lack professional medical knowledge, especially concerning rare diseases, and struggle to handle the latest rare disease information. They also cannot effectively manage rare disease data and are not directly suitable for diagnosis and management tasks. Our objective is to create an end-to-end system called AutoRD, which automates the extraction of information from medical texts about rare diseases, focusing on entities and their relations. AutoRD integrates up-to-date structured knowledge and demonstrates superior performance in rare disease extraction tasks. We conduct various experiments to evaluate AutoRD's performance, aiming to surpass common LLMs and traditional methods. |
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
#722
-
Cao 2024
Research on Large Language Model for Coal Mine Equipment Maintenance Based on Multi-Source Text
The efficient management and utilization of coal mine equipment maintenance knowledge is an indispensable foundation for advancing the establishment of intelligent mines. This knowledge has problems such as scattered, low sharing, and insufficient management, which restricts the development of coal mine intelligence. For the above-mentioned problems, a large language model for the maintenance of coal mine equipment based on multi-source text (XCoalChat) was proposed to better manage and utilize the existing massive knowledge of coal mine equipment maintenance. The dataset of coal mine equipment maintenance based on ReliableCEMK-Self-Instruction was constructed to obtain a wide and diverse amount of knowledge through sample generation. Aiming at the illusory problem of the large language model, a knowledge graph enhancement method based on the "Coal Mine Equipment Maintenance System-Full Life Cycle-Specification" was proposed to improve the knowledge density. A triple-LoRA fine-tuning mechanism and DPO direct preference optimization method were introduced into the top of the baseline model, which guarantees that XCoalChat can handle multiple Q&A and maintenance decision analysis tasks with limited computing power. Compared with ChatGLM, Bloom, and LLama, the comprehensive assessment of XCoalChat was performed by experiments including coal mine dialog consulting, coal mine professional consulting, and maintenance decision analysis. The results showed that XCoalChat achieved the best response accuracy in professional consulting and maintenance decision analysis; XCoalChat also took the least reasoning time on average. XCoalChat outperformed other mainstream large language models, which verify that XCoalChat is an effective large language model in the field of coal mine equipment maintenance. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#840
-
Carta 2024
Towards Zero-shot Knowledge Graph building: Automated Schema Inference
32nd ACM Conference on User Modeling, Adaptation and Personalization (ACM UMAP) 2024;():467-473 Cagliari, ITALY Assoc Computing Machinery 2024 DOI: 10.1145/3631700.3665234 · Ref ID: 3324 In the current Digital Transformation scenario, Knowledge Graphs are essential for comprehending, representing, and exploiting complex information in a structured form. The main paradigm in automatically generating proper Knowledge Graphs relies on predefined schemas or ontologies. Such schemas are typically manually constructed, requiring an intensive human effort, and are often sensitive to information loss due to negligence, incomplete analysis, or human subjectivity or inclination. Limiting human bias and the resulting information loss in creating proper Knowledge Graphs is paramount, particularly for user modeling in various sectors, such as education or healthcare. To this end, we propose a novel approach to automatically generating a proper entity schema. The devised methodology combines the language understanding capabilities of LLM with classical machine learning methods such as clustering to properly build an entity schema from a set of documents. This solution eliminates the need for human intervention and fosters a more efficient and comprehensive knowledge representation. The assessment of our proposal concerns adopting a state-of-the-art entity extraction model (UniNER) to estimate the relevance of the extracted entities based on the generated schema. Results confirm the potential of our approach, as we observed a negligible difference between the topic similarity score obtained with the ground truth and with the automatically generated schema (less than 1% on average on three different datasets). Such an outcome confirms that the proposed approach may be valuable in automatically creating an entity schema from a set of documents. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3554
-
Carta 2023
Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction
arXiv 2023;(): 2023 Ref ID: 7771 In the current digitalization era, capturing and effectively representing knowledge is crucial in most real-world scenarios. In this context, knowledge graphs represent a potent tool for retrieving and organizing a vast amount of information in a properly interconnected and interpretable structure. However, their generation is still challenging and often requires considerable human effort and domain expertise, hampering the scalability and flexibility across different application fields. This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models, such as GPT-3.5, that can address all the main critical issues in knowledge graph building. The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies in the main stages of the generation process. Our unique manifold approach may encompass significant benefits to the scientific community. In particular, the main contribution can be summarized by: (i) an innovative strategy for iteratively prompting large language models to extract relevant components of the final graph; (ii) a zero-shot strategy for each prompt, meaning that there is no need for providing examples for "guiding" the prompt result; (iii) a scalable solution, as the adoption of LLMs avoids the need for any external resources or human expertise. To assess the effectiveness of our proposed model, we performed experiments on a dataset that covered a specific domain. We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#798
-
Castell-Díaz 2023
Supporting SNOMED CT postcoordination with knowledge graph embeddings
SNOMED CT postcoordination is an underused mechanism that can help to implement advanced systems for the automatic extraction and encoding of clinical information from text. It allows defining non-existing SNOMED CT concepts by their relationships with existing ones. Manually building postcoordinated expressions is a difficult task. It requires a deep knowledge of the terminology and the support of specialized tools that barely exist. In order to support the building of postcoordinated expressions, we have implemented KGE4SCT: a method that suggests the corresponding SNOMED CT postcoordinated expression for a given clinical term. We leverage on the SNOMED CT ontology and its graph-like structure and use knowledge graph embeddings (KGEs). The objective of such embeddings is to represent in a vector space knowledge graph components (e.g. entities and relations) in a way that captures the structure of the graph. Then, we use vector similarity and analogies for obtaining the postcoordinated expression of a given clinical term. We obtained a semantic type accuracy of 98%, relationship accuracy of 90%, and analogy accuracy of 60%, with an overall completeness of postcoordination of 52% for the Spanish SNOMED CT version. We have also applied it to the English SNOMED CT version and outperformed state of the art methods in both, corpus generation for language model training for this task (improvement of 6% for analogy accuracy), and automatic postcoordination of SNOMED CT expressions, with an increase of 17% for partial conversion rate. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#792
-
Caufield 2024
Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning
Motivation: Creating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas. Results: Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. Availability and implementation: SPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#172
-
Celik 2023
Developmental Scaffolding with Large Language Models
IEEE International Conference on Development and Learning (ICDL) 2023;():396-402 Macau, PEOPLES R CHINA Ieee 2023 DOI: 10.1109/icdl55364.2023.10364374 · Ref ID: 3634 Exploration and self-observation are key mechanisms of infant sensorimotor development. These processes are further guided by parental scaffolding to accelerate skill and knowledge acquisition. In developmental robotics, this approach has been adopted often by having a human acting as the source of scaffolding. In this study, we investigate whether Large Language Models (LLMs) can act as a scaffolding agent for a robotic system that aims to learn to predict the effects of its actions. To this end, an object manipulation setup is considered where one object can be picked and placed on top of or in the vicinity of another object. The adopted LLM is asked to guide the action selection process through algorithmically generated state descriptions and action selection alternatives in natural language. The simulation experiments that include cubes in this setup show that LLM-guided (GPT3.5-guided) learning yields significantly faster discovery of novel structures compared to random exploration. However, we observed that GPT3.5 fails to effectively guide the robot in generating structures with different affordances such as cubes and spheres. Overall, we conclude that even without fine-tuning, LLMs may serve as a moderate scaffolding agent for improving robot learning, however, they still lack affordance understanding which limits the applicability of the current LLMs in robotic scaffolding tasks. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#289
-
Cenikj 2023
From language models to large-scale food and biomedical knowledge graphs
Knowledge about the interactions between dietary and biomedical factors is scattered throughout uncountable research articles in an unstructured form (e.g., text, images, etc.) and requires automatic structuring so that it can be provided to medical professionals in a suitable format. Various biomedical knowledge graphs exist, however, they require further extension with relations between food and biomedical entities. In this study, we evaluate the performance of three state-of-the-art relation-mining pipelines (FooDis, FoodChem and ChemDis) which extract relations between food, chemical and disease entities from textual data. We perform two case studies, where relations were automatically extracted by the pipelines and validated by domain experts. The results show that the pipelines can extract relations with an average precision around 70%, making new discoveries available to domain experts with reduced human effort, since the domain experts should only evaluate the results, instead of finding, and reading all new scientific papers. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3151
-
Chai 2024
RAVL: A Retrieval-Augmented Visual Language Model Framework for Knowledge-Based Visual Question Answering
Natural Language Processing and Chinese Computing: 13th National CCF Conference, NLPCC 2024, Hangzhou, China, November 1–3, 2024, Proceedings, Part III 2024;():394–406 Hangzhou, China Springer-Verlag 2024 DOI: 10.1007/978-981-97-9437-9_31 · Ref ID: 7142 |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#740
-
Chan 2021
SALKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning
35th Annual Conference on Neural Information Processing Systems (NeurIPS) 2021;34(): Electr Network Neural Information Processing Systems (Nips) 2021 Ref ID: 3300 Augmenting pre-trained language models with knowledge graphs (KGs) has achieved success on various commonsense reasoning tasks. However, for a given task instance, the KG, or certain parts of the KG, may not be useful. Although KG-augmented models often use attention to focus on specific KG components, the KG is still always used, and the attention mechanism is never explicitly taught which KG components should be used. Meanwhile, saliency methods can measure how much a KG feature (e.g., graph, node, path) influences the model to make the correct prediction, thus explaining which KG features are useful. This paper explores how saliency explanations can be used to improve KG-augmented models' performance. First, we propose to create coarse (Is the KG useful?) and fine (Which nodes/paths in the KG are useful?) saliency explanations. Second, to motivate saliency-based supervision, we analyze oracle KG-augmented models which directly use saliency explanations as extra inputs for guiding their attention. Third, we propose SALKG, a framework for KG-augmented models to learn from coarse and/or fine saliency explanations. Given saliency explanations created from a task's training set, SALKG jointly trains the model to predict the explanations, then solve the task by attending to KG features highlighted by the predicted explanations. On three commonsense QA benchmarks (CSQA, OBQA, CODAH) and a range of KG-augmented models, we show that SALKG can yield considerable performance gains - up to 2.76% absolute improvement on CSQA. |
Srividya: voted
Ishan: voted
Final decision: (blank)
#3324
-
Chan 2022
DeepTrust: A Reliable Financial Knowledge Retrieval Framework For Explaining Extreme Pricing Anomalies
arXiv 2022;(): 2022 Ref ID: 7527 Extreme pricing anomalies may occur unexpectedly without a trivial cause, and equity traders typically go through a meticulous process to source disparate information and analyze its reliability before integrating it into their trusted knowledge base. We introduce DeepTrust, a reliable financial knowledge retrieval framework on Twitter to explain extreme price moves at speed, while ensuring data veracity using state-of-the-art NLP techniques. Our proposed framework consists of three modules, specialized for anomaly detection, information retrieval and reliability assessment. The workflow starts with identifying anomalous asset price changes using machine learning models trained with historical pricing data, and retrieving correlated unstructured data from Twitter using enhanced queries with dynamic search conditions. DeepTrust extrapolates information reliability from tweet features, traces of generative language models, argumentation structure, subjectivity and sentiment signals, and refines a concise collection of credible tweets for market insights. The framework is evaluated on two self-annotated financial anomalies, i.e., Twitter and Facebook stock prices on 29 and 30 April 2021. The optimal setup outperforms the baseline classifier by 7.75% and 15.77% on F0.5-scores, and 10.55% and 18.88% on precision, respectively, proving its capability in screening unreliable information precisely. At the same time, the information retrieval and reliability assessment modules are analyzed individually for their effectiveness and causes of limitations, with identified subjective and objective factors that influence performance. As a collaborative project with Refinitiv, this framework paves a promising path towards building a scalable commercial solution that assists traders in reaching investment decisions on pricing anomalies with authenticated knowledge from social media platforms in real time. |
yuexi: voted
mohammed afaan: voted
Final decision: (blank)
#361
-
Chang 2021
Incorporating Domain Knowledge Into Language Models by Using Graph Convolutional Networks for Assessing Semantic Textual Similarity: Model Development and Performance Comparison
Background: Although electronic health record systems have facilitated clinical documentation in health care, they have also introduced new challenges, such as the proliferation of redundant information through the use of copy and paste commands or templates. One approach to trimming down bloated clinical documentation and improving clinical summarization is to identify highly similar text snippets with the goal of removing such text. Objective: We developed a natural language processing system for the task of assessing clinical semantic textual similarity. The system assigns scores to pairs of clinical text snippets based on their clinical semantic similarity. Methods: We leveraged recent advances in natural language processing and graph representation learning to create a model that combines linguistic and domain knowledge information from the MedSTS data set to assess clinical semantic textual similarity. We used bidirectional encoder representation from transformers (BERT)-based models as text encoders for the sentence pairs in the data set and graph convolutional networks (GCNs) as graph encoders for corresponding concept graphs that were constructed based on the sentences. We also explored techniques, including data augmentation, ensembling, and knowledge distillation, to improve the model's performance, as measured by the Pearson correlation coefficient (r). Results: Fine-tuning the BERT_base and ClinicalBERT models on the MedSTS data set provided a strong baseline (Pearson correlation coefficients: 0.842 and 0.848, respectively) compared to those of the previous year's submissions. Our data augmentation techniques yielded moderate gains in performance, and adding a GCN-based graph encoder to incorporate the concept graphs also boosted performance, especially when the node features were initialized with pretrained knowledge graph embeddings of the concepts (r=0.868). 
As expected, ensembling improved performance, and performing multisource ensembling by using different language model variants, conducting knowledge distillation with the multisource ensemble model, and taking a final ensemble of the distilled models further improved the system's performance (Pearson correlation coefficients: 0.875, 0.878, and 0.882, respectively). Conclusions: This study presents a system for the MedSTS clinical semantic textual similarity benchmark task, which was created by combining BERT-based text encoders and GCN-based graph encoders in order to incorporate domain knowledge into the natural language processing pipeline. We also experimented with other techniques involving data augmentation, pretrained concept embeddings, ensembling, and knowledge distillation to further increase our system's performance. Although the task and its benchmark data set are in the early stages of development, this study, as well as the results of the competition, demonstrates the potential of modern language model-based systems to detect redundant information in clinical notes. |
mohammed afaan: voted
yuexi: voted
Final decision: (blank)
#3319
-
Chaudhary 2024
Decoding Intelligence: A Framework for Certifying Knowledge Comprehension in LLMs
arXiv 2024;(): 2024 Ref ID: 8138 Knowledge comprehension capability is an important aspect of human intelligence. As Large Language Models (LLMs) are being envisioned as superhuman agents, it is crucial for them to be proficient at knowledge comprehension. However, existing benchmarking studies do not provide consistent, generalizable, and formal guarantees on the knowledge comprehension capabilities of LLMs. In this work, we propose the first framework to certify knowledge comprehension in LLMs with formal probabilistic guarantees. Our certificates are quantitative – they consist of high-confidence, tight bounds on the probability that a target LLM gives the correct answer on any knowledge comprehension prompt sampled from a distribution. We design and certify novel specifications that precisely represent distributions of knowledge comprehension prompts leveraging knowledge graphs. We certify SOTA LLMs for specifications over the Wikidata5m knowledge graph. We find that the knowledge comprehension capability improves significantly with scaling the size of the models. |
yuexi: voted
Srividya: voted
Final decision: (blank)
#1375
-
Che 2024
A Hierarchical Context Augmentation Method to Improve Retrieval-Augmented LLMs on Scientific Papers
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():243-254 Association for Computing Machinery 2024 DOI: 10.1145/3637528.3671847 · Ref ID: 3957 Scientific papers at a large scale on the Internet encompass a wealth of data and knowledge, attracting the attention of numerous researchers. To fully utilize this knowledge, Retrieval-Augmented Large Language Models (LLMs) usually leverage large-scale scientific corpora for training and then retrieve relevant passages from external memory to improve generation, which has demonstrated outstanding performance. However, existing methods can only capture one-dimensional, fragmented textual information without incorporating hierarchical structural knowledge, e.g., the deduction relationship between abstract and main body, which makes it difficult to grasp the central thought of papers. To tackle this problem, we propose a hierarchical context augmentation method, which helps Retrieval-Augmented LLMs autoregressively learn the structural knowledge of scientific papers. Specifically, we utilize the document tree to represent the hierarchical relationships of a paper and enhance the structural information of scientific context from three aspects: scale, format and global information. First, we treat each top-to-bottom path of the document tree as a logically independent context, which can be used to largely increase the scale of the extracted structural corpus. Second, we propose a novel label-based format to represent the structure of context in textual sequences, unified between training and inference. Third, we introduce the global information of retrieved passages to further enhance the structure of context. Extensive experiments on three scientific tasks show that the proposed method significantly improves the performance of Retrieval-Augmented LLMs on all tasks. Besides, our method achieves state-of-the-art performance on the question-answering task and outperforms ChatGPT.
Moreover, it also brings considerable gains even with irrelevant retrieved passages, illustrating its effectiveness in practical application scenarios. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. |
Ishan: voted
brandon: voted
Final decision: (blank)
#2201
-
Chellagurki 2024
Biomedical Relation Extraction Using LLMs and Knowledge Graphs
2024 IEEE 10th International Conference on Big Data Computing Service and Machine Learning Applications (BigDataService) 2024;():60-69 2024 DOI: 10.1109/BigDataService62917.2024.00015 · Ref ID: 7068 Due to the rapid growth of research papers on biomedical topics, it has become increasingly important to make advancements in biomedical Natural Language Processing (NLP). Biomedical NLP enables us to extract important information from text, such as new insights into the role of different genes in disease susceptibility, or the potential for drug therapies that are effective against one disease to work effectively against another. In this paper, we present a comparative evaluation of the binary relation classification capabilities of the current state-of-the-art binary relation classifier, BioBERT, against recently released open-source large language models, Gemma-2b, Gemma-7b, and Llama2-7b, which we fine-tune with the benchmark GAD and EU-ADR datasets. In addition, we quantify the potential of discovering new relationships by utilizing knowledge graphs built out of known binary relations. |
Xinchen: voted
Mike: voted
Final decision: (blank)
#1417
-
Chen 2024
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
12th International Conference on Learning Representations, ICLR 2024 2024;(): International Conference on Learning Representations, ICLR 2024 Ref ID: 4689 Knowledge hallucinations have raised widespread concerns about the security and reliability of deployed LLMs. Previous efforts to detect hallucinations have employed logit-level uncertainty estimation or language-level self-consistency evaluation, where semantic information is inevitably lost during the token-decoding procedure. Thus, we propose to explore the dense semantic information retained within LLMs' INternal States for hallucInation DEtection (INSIDE). In particular, a simple yet effective EigenScore metric is proposed to better evaluate responses' self-consistency, which exploits the eigenvalues of responses' covariance matrix to measure semantic consistency/diversity in the dense embedding space. Furthermore, from the perspective of self-consistent hallucination detection, a test-time feature clipping approach is explored to truncate extreme activations in the internal states, which reduces overconfident generations and potentially benefits the detection of overconfident hallucinations. Extensive experiments and ablation studies are performed on several popular LLMs and question-answering (QA) benchmarks, showing the effectiveness of our proposal. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved. |
yuexi: voted
Srividya: voted
Final decision: (blank)
#1846
-
Chen 2024
SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graphs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4345-4360 Association for Computational Linguistics (ACL) 2024 Ref ID: 4371 Knowledge graphs (KGs) play a pivotal role in knowledge-intensive tasks across specialized domains, where the acquisition of precise and dependable knowledge is crucial. However, existing KG construction methods heavily rely on human intervention to attain qualified KGs, which severely hinders the practical applicability in real-world scenarios. To address this challenge, we propose a general KG construction framework, named SAC-KG, to exploit large language models (LLMs) as Skilled Automatic Constructors for domain Knowledge Graph. SAC-KG effectively involves LLMs as domain experts to generate specialized and precise multi-level KGs. Specifically, SAC-KG consists of three components: Generator, Verifier, and Pruner. For a given entity, Generator produces its relations and tails from raw domain corpora, to construct a specialized single-level KG. Verifier and Pruner then work together to ensure precision by correcting generation errors and determining whether newly produced tails require further iteration for the next-level KG. Experiments demonstrate that SAC-KG automatically constructs a domain KG at the scale of over one million nodes and achieves a precision of 89.32%, leading to a superior performance with over 20% increase in precision rate compared to existing state-of-the-art methods for the KG construction task. © 2024 Association for Computational Linguistics. |
Ishan: voted
Srividya: voted
Final decision: (blank)
#3623
-
Chen 2023
Large Knowledge Model: Perspectives and Challenges
arXiv 2023;(): 2023 Ref ID: 7970 Humankind's understanding of the world is fundamentally linked to our perception and cognition, with \emph{human languages} serving as one of the major carriers of \emph{world knowledge}. In this vein, \emph{Large Language Models} (LLMs) like ChatGPT epitomize the pre-training of extensive, sequence-based world knowledge into neural networks, facilitating the processing and manipulation of this knowledge in a parametric space. This article explores large models through the lens of "knowledge". We initially investigate the role of symbolic knowledge such as Knowledge Graphs (KGs) in enhancing LLMs, covering aspects like knowledge-augmented language model, structure-inducing pre-training, knowledgeable prompts, structured CoT, knowledge editing, semantic tools for LLM and knowledgeable AI agents. Subsequently, we examine how LLMs can boost traditional symbolic knowledge bases, encompassing aspects like using LLM as KG builder and controller, structured knowledge pretraining, and LLM-enhanced symbolic reasoning. Considering the intricate nature of human knowledge, we advocate for the creation of \emph{Large Knowledge Models} (LKM), specifically engineered to manage diversified spectrum of knowledge structures. This promising undertaking would entail several key challenges, such as disentangling knowledge base from language models, cognitive alignment with human knowledge, integration of perception and cognition, and building large commonsense models for interacting with physical world, among others. We finally propose a five-"A" principle to distinguish the concept of LKM. |
Srividya: voted
Ishan: voted
Final decision: (blank)
#141
-
Chen 2023
Contextual semantic embeddings for ontology subsumption prediction
Automating ontology construction and curation is an important but challenging task in knowledge engineering and artificial intelligence. Prediction by machine learning techniques such as contextual semantic embedding is a promising direction, but the relevant research is still preliminary especially for expressive ontologies in Web Ontology Language (OWL). In this paper, we present a new subsumption prediction method named BERTSubs for classes of OWL ontology. It exploits the pre-trained language model BERT to compute contextual embeddings of a class, where customized templates are proposed to incorporate the class context (e.g., neighbouring classes) and the logical existential restriction. BERTSubs is able to predict multiple kinds of subsumers including named classes from the same ontology or another ontology, and existential restrictions from the same ontology. Extensive evaluation on five real-world ontologies for three different subsumption tasks has shown the effectiveness of the templates and that BERTSubs can dramatically outperform the baselines that use (literal-aware) knowledge graph embeddings, non-contextual word embeddings and the state-of-the-art OWL ontology embeddings. |
mohammed afaan: voted
yuexi: voted
Final decision: (blank)
#1609
-
Chen 2024
LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery
Proceedings - IEEE International Conference on Robotics and Automation 2024;():10772-10778 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICRA57147.2024.10610603 · Ref ID: 4475 Visual question answering (VQA) can be fundamentally crucial for promoting robotic-assisted surgical education. In practice, the needs of trainees are constantly evolving, such as learning more surgical types and adapting to new surgical instruments/techniques. Therefore, continually updating the VQA system by a sequential data stream from multiple resources is demanded in robotic surgery to address new tasks. In surgical scenarios, the privacy issue of patient data often restricts the availability of old data when updating the model, necessitating an exemplar-free continual learning (CL) setup. However, prior studies overlooked two vital problems of the surgical domain: i) large domain shifts from diverse surgical operations collected from multiple departments or clinical centers, and ii) severe data imbalance arising from the uneven presence of surgical instruments or activities during surgical procedures. This paper proposes to address these two problems with a multimodal large language model (LLM) and an adaptive weight assignment methodology. We first develop a new multi-teacher CL framework that leverages a multimodal LLM as the additional teacher. The strong generalization ability of the LLM can bridge the knowledge gap when domain shifts and data imbalances occur. We then put forth a novel data processing method that transforms complex LLM embeddings into logits compatible with our CL framework. We also design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of the old CL model. Finally, we construct a new dataset for surgical VQA tasks. Extensive experimental results demonstrate the superiority of our method to other advanced CL models. © 2024 IEEE. |
brandon: voted
Kwesi: voted
Final decision: (blank)
#371
-
Chen 2024
Information Extraction of Aviation Accident Causation Knowledge Graph: An LLM-Based Approach
Summarizing the causation of aviation accidents is conducive to enhancing aviation safety. The knowledge graph of aviation accident causation, constructed based on aviation accident reports, can assist in analyzing the causes of aviation accidents. With the continuous development of artificial intelligence technology, leveraging large language models for information extraction and knowledge graph construction has demonstrated significant advantages. This paper proposes an information extraction method for aviation accident causation based on Claude-prompt, which relies on the large-scale pre-trained language model Claude 3.5. Through prompt engineering, combined with a few-shot learning strategy and a self-judgment mechanism, this method achieves automatic extraction of accident-cause entities and their relationships. Experimental results indicate that this approach effectively improves the accuracy of information extraction, overcoming the limitations of traditional methods in terms of accuracy and efficiency in processing complex texts. It provides strong support for subsequently constructing a structured knowledge graph of aviation accident causation and conducting causation analysis of aviation accidents. |
Srividya: voted
Ishan: voted
Final decision: (blank)
#1409
-
Chen 2021
Incorporating Domain Knowledge into Language Transformers for Multi-Label Classification of Chinese Medical Questions
ROCLING 2021 - Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing 2021;():265-270 The Association for Computational Linguistics and Chinese Language Processing (ACLCLP) 2021 Ref ID: 5658 In this paper, we propose a knowledge infusion mechanism to incorporate domain knowledge into language transformers. Weakly supervised data is regarded as the main source for knowledge acquisition. We pre-train the language models to capture masked knowledge of focuses and aspects and then fine-tune them to obtain better performance on the downstream tasks. Due to the lack of publicly available datasets for multi-label classification of Chinese medical questions, we crawled questions from medical question/answer forums and manually annotated them using eight predefined classes: persons and organizations, symptom, cause, examination, disease, information, ingredient, and treatment. In total, 1,814 questions with 2,340 labels were collected; each question contains an average of 1.29 labels. We used Baidu Medical Encyclopedia as the knowledge resource. Two transformers, BERT and RoBERTa, were implemented to compare performance on our constructed datasets. Experimental results showed that our proposed model with the knowledge infusion mechanism achieves better performance regardless of the evaluation metric considered, including Macro F1, Micro F1, Weighted F1 and Subset Accuracy. © 2021 ROCLING 2021 - Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing. All rights reserved. |
Xinchen: voted
Srividya: voted
Final decision: (blank)
#3184
-
Chen 2024
Apollonion: Profile-centric Dialog Agent
arXiv 2024;(): 2024 Ref ID: 8230 The emergence of Large Language Models (LLMs) has transformed the development of dialog agents. Specifically, a well-trained LLM, as a central processing unit, is capable of providing fluent and reasonable responses to user requests. In addition, auxiliary tools such as external knowledge retrieval, personalized characters for vivid responses, and short/long-term memory for ultra-long context management have been developed, rounding out the usage experience of LLM-based dialog agents. However, the above-mentioned techniques do not solve the issue of personalization from the user's perspective: agents respond in the same fashion to different users, without considering their features, such as habits, interests and past experience. In other words, current implementations of dialog agents fail at "knowing the user"; the capacity to describe and represent the user well is still under development. In this work, we propose a framework for dialog agents that incorporates user profiling (initialization and update): the user's queries and responses are analyzed and organized into a structured user profile, which is later used to provide more personal and precise responses. We also propose a series of evaluation protocols for personalization: to what extent a response is personal to different users. The framework is named Apollonion, inspired by the inscription "Know Yourself" in the temple of Apollo in Ancient Greece. Few works have incorporated personalization into LLMs; Apollonion is a pioneering effort to guide an LLM's responses toward individuation via dialog agents, with a set of evaluation methods for measuring personalization. |
brandon: voted
Kwesi: voted
Final decision: (blank)
#3392
-
Chen 2024
Entity Alignment with Noisy Annotations from Large Language Models
arXiv 2024;(): 2024 Ref ID: 8319 Entity alignment (EA) aims to merge two knowledge graphs (KGs) by identifying equivalent entity pairs. While existing methods heavily rely on human-generated labels, it is prohibitively expensive to incorporate cross-domain experts for annotation in real-world scenarios. The advent of Large Language Models (LLMs) presents new avenues for automating EA with annotations, inspired by their comprehensive capability to process semantic information. However, it is nontrivial to directly apply LLMs for EA since the annotation space in real-world KGs is large. LLMs could also generate noisy labels that may mislead the alignment. To this end, we propose a unified framework, LLM4EA, to effectively leverage LLMs for EA. Specifically, we design a novel active learning policy to significantly reduce the annotation space by prioritizing the most valuable entities based on the entire inter-KG and intra-KG structure. Moreover, we introduce an unsupervised label refiner to continuously enhance label accuracy through in-depth probabilistic reasoning. We iteratively optimize the policy based on the feedback from a base EA model. Extensive experiments demonstrate the advantages of LLM4EA on four benchmark datasets in terms of effectiveness, robustness, and efficiency. Codes are available via https://github.com/chensyCN/llm4ea_official. |
Davis: voted
mohammed afaan: voted
Final decision: (blank)
#464
-
Chen 2008
Knowledge sharing in virtual enterprises via an ontology-based access control approach
Collaborating throughout a product life cycle via virtual enterprise (VE) is one of the most promising strategies for enhancing global competitiveness. Efficient and secure knowledge sharing is critical to the success of a VE. This study presents a novel approach, model and technology for knowledge access control and sharing across enterprises. First, this study proposes an ontology-based knowledge sharing model and a multiple-layer knowledge representation framework on which a knowledge access control model for knowledge sharing in a VE is proposed. In the proposed model, user authorizations permitting access to knowledge in a VE are classified into two levels: (1) basic privileges and (2) extended privileges. The former is evaluated from four dimensions, i.e. who, what, when and where, while the latter is determined by considering how three domain ontologies, i.e., product, organization and activity, are related. This study then develops a knowledge access control policy (KACP) language model which is used to identify the knowledge access control and sharing rules of a VE and all its enterprise members. The knowledge access control model proposed in this study can facilitate VE Knowledge management and sharing across enterprises, enhance knowledge sharing security and flexibility and regulate knowledge sharing to expeditiously reflect changes in the business environment. (c) 2007 Elsevier B.V. All rights reserved. |
Srividya: voted
Xinchen: voted
Final decision: (blank)
#1071
-
Chen 2022
Chinese Machine Reading Comprehension Based on Language Model Containing Knowledge
ACM International Conference Proceeding Series 2022;(): Association for Computing Machinery 2022 DOI: 10.1145/3565387.3565405 · Ref ID: 5345 Machine reading comprehension (MRC) is a task that requires machines to answer relevant questions based on a given context. In recent years, it has attracted extensive attention with the development of deep learning and big data. Considering that human beings associate external relevant knowledge when understanding text, researchers have proposed introducing knowledge from outside the given context to assist reading; this method is called Knowledge-Based Machine Reading Comprehension (KBMRC). However, current research on this method is still scattered, and the retrieval and fusion of relevant knowledge remain two challenges in application, especially in Chinese MRC. The contributions of this paper are mainly the following three points: first, to address the problem of related-knowledge retrieval, we build a related knowledge set; second, to address the problem of related-knowledge fusion, we propose a negative-sample generation strategy and train a language model containing knowledge; finally, a twin-tower fusion model is constructed based on this model. Experiments on the Chinese reading comprehension dataset CMRC2018 show that our method achieves a certain improvement over the baseline method without external knowledge. © 2022 Association for Computing Machinery. |
Xinchen: voted
mohammed afaan: voted
Final decision: (blank)
#1783
-
Chen 2024
RareBench: Can LLMs Serve as Rare Diseases Specialists?
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():4850-4861 Association for Computing Machinery 2024 DOI: 10.1145/3637528.3671576 · Ref ID: 3993 Generalist Large Language Models (LLMs), such as GPT-4, have shown considerable promise in various domains, including medical diagnosis. Rare diseases, affecting approximately 300 million people worldwide, often have unsatisfactory clinical diagnosis rates primarily due to a lack of experienced physicians and the complexity of differentiating among many rare diseases. In this context, recent news such as "ChatGPT correctly diagnosed a 4-year-old's rare disease after 17 doctors failed"underscore LLMs' potential, yet underexplored, role in clinically diagnosing rare diseases. To bridge this research gap, we introduce RareBench, a pioneering benchmark designed to systematically evaluate the capabilities of LLMs on 4 critical dimensions within the realm of rare diseases. Meanwhile, we have compiled the largest open-source dataset on rare disease patients, establishing a benchmark for future studies in this domain. To facilitate differential diagnosis of rare diseases, we develop a dynamic few-shot prompt methodology, leveraging a comprehensive rare disease knowledge graph synthesized from multiple knowledge bases, significantly enhancing LLMs' diagnostic performance. Moreover, we present an exhaustive comparative study of GPT-4's diagnostic capabilities against those of specialist physicians. Our experimental findings underscore the promising potential of integrating LLMs into the clinical diagnostic process for rare diseases. This paves the way for exciting possibilities in future advancements in this field. © 2024 Copyright held by the owner/author(s). |
mohammed afaan: voted
yuexi: voted
Final decision: (blank)
#1823
-
Chen 2024
Retrieval-Augmented Knowledge Integration into Language Models: A Survey
KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():45-63 Association for Computational Linguistics (ACL) 2024 Ref ID: 4251 This survey analyses how external knowledge can be integrated into language models in the context of retrieval-augmentation. The main goal of this work is to give an overview of: (1) Which external knowledge can be augmented? (2) Given a knowledge source, how to retrieve from it and then integrate the retrieved knowledge? To achieve this, we define and give a mathematical formulation of retrieval-augmented knowledge integration (RAKI). We discuss retrieval and integration techniques separately in detail, for each of the following knowledge formats: knowledge graph, tabular and natural language. © 2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
#1937
-
Chen 2025
Temporal Knowledge Graph Link Prediction Using Synergized Large Language Models and Temporal Knowledge Graphs
Communications in Computer and Information Science 2025;2183 CCIS():33-45 Springer Science and Business Media Deutschland GmbH 2025 DOI: 10.1007/978-981-97-7007-6_3 · Ref ID: 3875 Although large language models and temporal knowledge graphs each have significant advantages in the field of artificial intelligence, they also face certain challenges. However, through collaboration, large language models and temporal knowledge graphs can complement each other, addressing their respective shortcomings. This collaborative approach aims to harness the potential feasibility and practical effectiveness of large language models as external knowledge bases for temporal knowledge graph reasoning tasks. In our research, we have meticulously designed a synergized model that leverages the knowledge from the graph as prompts. The answers generated by the large language model undergo careful processing before being seamlessly incorporated into the training dataset. The ultimate goal is to significantly enhance the reasoning capabilities of temporal knowledge graphs. Experimental results underscore the positive impact of this synergized model on the completion tasks of temporal knowledge graphs, showcasing its potential to address gaps in knowledge and improve overall performance. While its influence on prediction tasks is relatively weak, the collaborative synergy demonstrates promising avenues for further exploration and development in the realm of AI research. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025. |
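The synergy loop described above (LLM answers carefully processed before entering the TKG training data) can be sketched with stand-in names. `llm_answer`, the entity set, and the quadruples are all hypothetical; the point is the validation gate between generation and inclusion.

```python
ENTITIES = {"france", "germany", "japan"}
# Temporal KG facts as (subject, relation, object, timestamp) quadruples.
train = [("macron", "president_of", "france", 2020)]

def llm_answer(subject, relation, timestamp):
    """Stand-in for the large language model's generated answer."""
    return "france"

def augment(train, query, entities):
    """Query the LLM for a missing object and validate the answer
    against the known entity set before adding it to the training data."""
    s, r, t = query
    obj = llm_answer(s, r, t).strip().lower()
    if obj in entities:  # the "careful processing" gate
        train.append((s, r, obj, t))
    return train

augmented = augment(list(train), ("macron", "president_of", 2022), ENTITIES)
```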
Ishan
voted
Srividya
voted
#2024
-
Chen 2023
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():13342-13357 Association for Computational Linguistics (ACL) 2023 Ref ID: 4940 State-of-the-art vision-language models (VLMs) still have limited performance in structural knowledge extraction, such as relations between objects. In this work, we present ViStruct, a training framework to learn VLMs for effective visual structural knowledge extraction. Two novel designs are incorporated. First, we propose to leverage the inherent structure of programming language to depict visual structural information. This approach enables explicit and consistent representation of visual structural information of multiple granularities, such as concepts, relations, and events, in a well-organized structured format. Second, we introduce curriculum-based learning for VLMs to progressively comprehend visual structures, from fundamental visual concepts to intricate event structures. Our intuition is that lower-level knowledge may contribute to complex visual structure understanding. Furthermore, we compile and release a collection of datasets tailored for visual structural knowledge extraction. We adopt a weakly-supervised approach to directly generate visual event structures from captions for ViStruct training, capitalizing on abundant image-caption pairs from the web. In experiments, we evaluate ViStruct on visual structure prediction tasks, demonstrating its effectiveness in improving the understanding of visual structures. The code is public at https://github.com/Yangyi-Chen/vi-struct. ©2023 Association for Computational Linguistics. |
Mike
voted
Xinchen
voted
#2001
-
Chen 2024
Uncertain Knowledge Graph Completion with Rule Mining
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14883 LNCS():100-112 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-981-97-7707-5_9 · Ref ID: 4245 To model the uncertainty within knowledge graphs (KGs), existing studies define uncertain knowledge graphs (UKGs), which assign a confidence score to each triple to measure its likelihood of being true, making downstream tasks such as reasoning and decision making more precise. Since KGs usually suffer from the problem of incompleteness, methods of rule mining and reasoning for knowledge graph completion are extensively studied due to their excellent interpretability. However, previous methods are all conducted under deterministic scenarios, neglecting the uncertainty of knowledge, making them unable to be directly applied to UKGs. In this paper, we propose a new framework for uncertain knowledge graph completion with rule mining. The framework is composed of a rule mining model and a confidence prediction model. The rule mining model applies an encoder-decoder Transformer network, treating rule mining as a sequence-to-sequence task to generate rules. It models the uncertainty in UKGs and infers new triples by differentiable reasoning based on TensorLog with the mined rules. The confidence prediction model uses a pre-trained language model to predict the triple confidence given the rules mined. Experiments show that our models significantly outperform various baselines in different evaluation metrics on link prediction and confidence prediction, respectively. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024. |
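Rule-based inference over an uncertain KG can be sketched as follows. The facts, rule, and choice of combining confidences by product (one simple t-norm; the paper's exact scheme differs) are all illustrative, not the paper's model:

```python
# Uncertain KG: each triple carries a confidence score in [0, 1].
facts = {
    ("alice", "born_in", "paris"): 0.9,
    ("paris", "city_of", "france"): 0.95,
}

def apply_rule(facts, body, head_rel):
    """Apply a mined chain rule (x, r1, y) & (y, r2, z) => (x, head_rel, z),
    assigning each inferred triple the product of its body confidences."""
    r1, r2 = body
    inferred = {}
    for (x, ra, y), c1 in facts.items():
        if ra != r1:
            continue
        for (y2, rb, z), c2 in facts.items():
            if rb == r2 and y2 == y:
                inferred[(x, head_rel, z)] = c1 * c2
    return inferred

new = apply_rule(facts, ("born_in", "city_of"), "nationality_of")
```

This is the deterministic skeleton of the idea; the framework's contribution is mining such rules with a Transformer and predicting confidences with a pre-trained language model.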
Mike
voted
Davis
voted
#3541
-
Chen 2024
Integrating Multi-Head Convolutional Encoders with Cross-Attention for Improved SPARQL Query Translation
arXiv 2024;(): 2024 Ref ID: 8558 The main task of the KGQA system (Knowledge Graph Question Answering) is to convert user input questions into query syntax (such as SPARQL). With the rise of modern popular encoders and decoders like Transformer and ConvS2S, many scholars have shifted the research direction of SPARQL generation to the Neural Machine Translation (NMT) architecture or the generative AI field of Text-to-SPARQL. In NMT-based QA systems, the system treats knowledge base query syntax as a language. It uses NMT-based translation models to translate natural language questions into query syntax. Scholars use popular architectures equipped with cross-attention, such as Transformer, ConvS2S, and BiLSTM, to train translation models for query syntax. To achieve better query results, this paper improves the ConvS2S encoder and adds multi-head attention from the Transformer, proposing a Multi-Head Conv encoder (MHC encoder) based on the n-gram language model. The principle is to use convolutional layers to capture local hidden features in the input sequence with different receptive fields, using multi-head attention to calculate dependencies between them. Ultimately, we found that the translation model based on the Multi-Head Conv encoder achieved better performance than other encoders, obtaining 76.52% and 83.37% BLEU-1 (BiLingual Evaluation Understudy) on the QALD-9 and LC-QuAD-1.0 datasets, respectively. Additionally, in the end-to-end system experiments on the QALD-9 and LC-QuAD-1.0 datasets, we achieved leading results over other KGQA systems, with Macro F1-measures reaching 52% and 66%, respectively. Moreover, the experimental results show that with limited computational resources, if one possesses an excellent encoder-decoder architecture and cross-attention, experts and scholars can achieve outstanding performance equivalent to large pre-trained models using only general embeddings. |
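The multi-receptive-field intuition behind the MHC encoder can be illustrated in pure Python, with n-gram windows standing in for convolution kernels of different widths (the function names and widths are hypothetical, not the paper's code): each "head" sees contiguous local windows of a different size over the input tokens.

```python
def ngram_windows(tokens, width):
    """All contiguous windows of the given width: one conv kernel's view."""
    return [tuple(tokens[i:i + width]) for i in range(len(tokens) - width + 1)]

def multi_receptive_fields(tokens, widths=(1, 2, 3)):
    """Collect local features at several receptive-field sizes; in the real
    encoder, multi-head attention then relates these features to each other."""
    return {w: ngram_windows(tokens, w) for w in widths}

feats = multi_receptive_fields(["who", "wrote", "hamlet"])
```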
mohammed afaan
voted
Ishan
voted
#3649
-
Chen 2024
Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law
arXiv 2024;(): 2024 Ref ID: 8669 Court efficiency is vital for social stability. However, in most countries around the world, grassroots courts face case backlogs, with decisions relying heavily on judicial personnel's cognitive labor and lacking intelligent tools to improve efficiency. To address this issue, we propose an efficient law article recommendation approach utilizing a Knowledge Graph (KG) and a Large Language Model (LLM). Firstly, we propose a Case-Enhanced Law Article Knowledge Graph (CLAKG) as a database to store current law statutes, historical case information, and correspondence between law articles and historical cases. Additionally, we introduce an automated CLAKG construction method based on LLM. On this basis, we propose a closed-loop law article recommendation method. Finally, through a series of experiments using judgment documents from the website "China Judgements Online", we have improved the accuracy of law article recommendation in cases from 0.549 to 0.694, demonstrating that our proposed method significantly outperforms baseline approaches. |
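The case-to-article recommendation step can be sketched with a toy graph. The `case_kg` structure, the Jaccard similarity, and the article numbers are all hypothetical stand-ins for the CLAKG and its recommendation method: find the most similar historical case, then follow its KG edge to the cited law articles.

```python
# Toy case-enhanced KG: each historical case links facts to cited articles.
case_kg = {
    "case_1": {"facts": {"theft", "night", "residence"}, "articles": {"Art. 264"}},
    "case_2": {"facts": {"fraud", "contract"}, "articles": {"Art. 266"}},
}

def recommend_articles(query_facts, kg):
    """Recommend the articles cited by the most fact-similar historical case."""
    def sim(case):
        f = kg[case]["facts"]
        return len(f & query_facts) / len(f | query_facts)
    best = max(kg, key=sim)
    return kg[best]["articles"]

arts = recommend_articles({"theft", "residence"}, case_kg)
```

In the paper's closed loop, an LLM both builds this graph from judgment documents and consumes the retrieved articles; the lookup above is only the retrieval core.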
Davis
voted
Mike
voted
#3601
-
Chen 2024
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
arXiv 2024;(): 2024 Ref ID: 8310 Large language models (LLMs) store extensive factual knowledge, but the mechanisms behind how they store and express this knowledge remain unclear. The Knowledge Neuron (KN) thesis is a prominent theory for explaining these mechanisms. This theory is based on the knowledge localization (KL) assumption, which suggests that a fact can be localized to a few knowledge storage units, namely knowledge neurons. However, this assumption may be overly strong regarding knowledge storage and neglects knowledge expression mechanisms. Thus, we re-examine the KL assumption and confirm the existence of facts that do not adhere to it from both statistical and knowledge modification perspectives. Furthermore, we propose the Query Localization (QL) assumption. (1) Query-KN Mapping: The localization results are associated with the query rather than the fact. (2) Dynamic KN Selection: The attention module contributes to the selection of KNs for answering a query. Based on this, we further propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification. We conduct 39 sets of experiments, along with additional visualization experiments, to rigorously validate our conclusions. |
Davis
voted
yuexi
voted
#1911
-
Chen 2024
Study on Entity Extraction Method for Pharmaceutical Instructions Based on Pretrained Models
J. Frontier. Comput. Sci. Technol. 2024;18(7):1911-1922 2024 DOI: 10.3778/j.issn.1673-9418.2304078 · Ref ID: 3925 The extraction of medical entities from drug instructions provides fundamental data for the intelligent retrieval of medication information and the construction of medical knowledge graphs, with remarkable research significance and practical value. However, the heterogeneity of medical entities in drug instructions for treating different diseases poses challenges in model training, which requires a large number of annotated samples. To address this issue, a "large model + small model" design approach is used in this research. Specifically, this research proposes a part-label named entity recognition model based on a pre-trained model, which first employs a pre-trained language model fine-tuned on a small number of samples to extract partial entities from drug instructions, and then utilizes a Transformer-based part-label model to further optimize the entity extraction results. The part-label model encodes the input text, identified partial entities, and entity labels using a planar lattice structure, extracts feature representations using Transformer, and predicts entity labels through a conditional random fields (CRF) layer. To reduce the need for annotated training data, a sample data augmentation method is proposed using an entity masking strategy on labeled samples to train the part-label model. Experimental results validate the feasibility of the "large model + small model" approach in medical entity extraction, with precision (P), recall (R), and F1 score of 85.0%, 86.1%, and 85.6%, respectively, demonstrating superior performance compared with other learning methods. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
Mike
voted
Srividya
voted
#281
-
Chen 2023
The First Workshop on Personalized Generative AI @ CIKM 2023: Personalization Meets Large Language Models
32nd ACM International Conference on Information and Knowledge Management (CIKM) 2023;():5267-5270 Birmingham, ENGLAND Assoc Computing Machinery 2023 DOI: 10.1145/3583780.3615314 · Ref ID: 3552 The First Workshop on Personalized Generative AI(1) aims to be a cornerstone event fostering innovation and collaboration in the dynamic field of personalized AI. Leveraging the potent capabilities of Large Language Models (LLMs) to enhance user experiences with tailored responses and recommendations, the workshop is designed to address a range of pressing challenges including knowledge gap bridging, hallucination mitigation, and efficiency optimization in handling extensive user profiles. As a nexus for academics and industry professionals, the event promises rich discussions on a plethora of topics such as the development and fine-tuning of foundational models, strategies for multi-modal personalization, and the imperative ethical and privacy considerations in LLM deployment. Through a curated series of keynote speeches, insightful panel discussions, and hands-on sessions, the workshop aspires to be a catalyst in the development of more precise, contextually relevant, and user-centric AI systems. It aims to foster a landscape where generative AI systems are not only responsive but also anticipatory of individual user needs, marking a significant stride in personalized experiences. |
mohammed afaan
voted
yuexi
voted
#1934
-
Chen 2023
Tele-Knowledge Pre-training for Fault Analysis
Proceedings - International Conference on Data Engineering 2023;2023-April():3453-3466 IEEE Computer Society 2023 DOI: 10.1109/ICDE55515.2023.00265 · Ref ID: 5233 In this work, we share our experience on tele-knowledge pre-training for fault analysis, a crucial task in telecommunication applications that requires a wide range of knowledge normally found in both machine log data and product documents. To organize this knowledge from experts uniformly, we propose to create a Tele-KG (tele-knowledge graph). Using this valuable data, we further propose a tele-domain language pre-training model TeleBERT and its knowledge-enhanced version, a tele-knowledge re-training model KTeleBERT, which includes effective prompt hints, adaptive numerical data encoding, and two knowledge injection paradigms. Concretely, our proposal includes two stages: first, pre-training TeleBERT on 20 million tele-related corpora, and then re-training it on 1 million causal and machine-related corpora to obtain KTeleBERT. Our evaluation on multiple tasks related to fault analysis in tele-applications, including root-cause analysis, event association prediction, and fault chain tracing, shows that pretraining a language model with tele-domain data is beneficial for downstream tasks. Moreover, the KTeleBERT re-training further improves the performance of task models, highlighting the effectiveness of incorporating diverse tele-knowledge into the model. © 2023 IEEE. |
Mike
voted
mohammed afaan
voted
#3771
-
Chen 2024
The Power of Noise: Toward a Unified Multi-modal Knowledge Graph Representation Framework
arXiv 2024;(): 2024 Ref ID: 8173 The advancement of Multi-modal Pre-training highlights the necessity for a robust Multi-Modal Knowledge Graph (MMKG) representation learning framework. This framework is crucial for integrating structured knowledge into multi-modal Large Language Models (LLMs) at scale, aiming to alleviate issues like knowledge misconceptions and multi-modal hallucinations. In this work, to evaluate models' ability to accurately embed entities within MMKGs, we focus on two widely researched tasks: Multi-modal Knowledge Graph Completion (MKGC) and Multi-modal Entity Alignment (MMEA). Building on this foundation, we propose a novel SNAG method that utilizes a Transformer-based architecture equipped with modality-level noise masking for the robust integration of multi-modal entity features in KGs. By incorporating specific training objectives for both MKGC and MMEA, our approach achieves SOTA performance across a total of ten datasets (three for MKGC and seven for MMEA), demonstrating its robustness and versatility. Besides, SNAG can not only function as a standalone model but also enhance other existing methods, providing stable performance improvements. Our code and data are available at: https://github.com/zjukg/SNAG. |
Davis
voted
mohammed afaan
voted
#197
-
Cheng 2024
Editing Language Model-Based Knowledge Graph Embeddings
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():17835-17843 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 2942 Recent decades have witnessed the empirical success of framing Knowledge Graph (KG) embeddings via language models. However, language model-based KG embeddings are usually deployed as static artifacts, making them difficult to modify after deployment without re-training. To address this issue, we propose a new task of editing language model-based KG embeddings in this paper. This task is designed to facilitate rapid, data-efficient updates to KG embeddings without compromising the performance of other aspects. We build four new datasets: E-FB15k237, A-FB15k237, E-WN18RR, and A-WN18RR, and evaluate several knowledge editing baselines, demonstrating the limited ability of previous models to handle the proposed challenging task. We further propose a simple yet strong baseline dubbed KGEditor, which utilizes additional parametric layers of the hypernetwork to edit/add facts. Our comprehensive experimental results reveal that KGEditor excels in updating specific facts without impacting the overall performance, even when faced with limited training resources. Code and datasets will be available at https://github.com/AnonymousForPapers/DeltaKG. |
yuexi
voted
mohammed afaan
voted
#1047
-
Cheng 2024
Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():4275-4295 Association for Computational Linguistics (ACL) 2024 Ref ID: 4297 Large Language Models (LLMs) have shown potential in reasoning over structured environments, e.g., knowledge graphs and tables. Such tasks typically require multi-hop reasoning, i.e., matching natural language utterances with instances in the environment. Previous works adopt LLMs to incrementally build a reasoning path, where LLMs either invoke tools or pick up items by interacting with the environment step by step. We propose Reasoning-Path-Editing (Readi), a novel framework where LLMs can efficiently and faithfully reason over structured environments. In Readi, LLMs initially generate a reasoning path given a query, and edit the path only when necessary. We instantiate the path on structured environments and provide feedback to edit the path if anything goes wrong. Experimental results on three KGQA and two TableQA datasets show the effectiveness of Readi, significantly surpassing previous LLM-based methods (by 9.1% Hit@1 on WebQSP, 12.4% on MQA-3H and 9.5% on WTQ), comparable with state-of-the-art fine-tuned methods (67% on CWQ and 74.7% on WebQSP) and substantially boosting the vanilla LLMs (by 14.9% on CWQ). Our code will be available on https://aka.ms/readi. © 2024 Association for Computational Linguistics. |
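The generate-then-edit loop can be sketched on a toy graph. The KG, relation names, and `edit` callback below are illustrative stand-ins, not Readi's implementation: a candidate relation path is instantiated on the graph, and only when a hop fails is the path edited using the failure as feedback.

```python
# Toy KG: (entity, relation) -> next entity.
KG = {
    ("obama", "born_in"): "honolulu",
    ("honolulu", "located_in"): "hawaii",
}

def instantiate(start, path, kg):
    """Follow the relation path; return (answer, None) on success,
    or (last reached entity, failed relation) on the first bad hop."""
    node = start
    for rel in path:
        nxt = kg.get((node, rel))
        if nxt is None:
            return node, rel
        node = nxt
    return node, None

def readi(start, path, kg, edit):
    """Instantiate the initial path; invoke the editor only on failure."""
    answer, failed = instantiate(start, path, kg)
    if failed is None:
        return answer  # faithful path, no editing needed
    return instantiate(start, edit(path, failed), kg)[0]

# The first path fails at "city_of"; the edit callback swaps in "located_in".
ans = readi("obama", ["born_in", "city_of"], KG,
            lambda p, bad: [r if r != bad else "located_in" for r in p])
```

In the paper, the editor is the LLM itself, prompted with the instantiation feedback; the control flow (edit only when necessary) is what makes the approach efficient.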
Xinchen
voted
Srividya
voted
#1771
-
Chepurova 2024
Prompt Me One More Time: A Two-Step Knowledge Extraction Pipeline with Ontology-Based Verification
TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():61-77 Association for Computational Linguistics (ACL) 2024 Ref ID: 4410 This study explores a method for extending real-world knowledge graphs (specifically, Wikidata) by extracting triplets from texts with the aid of Large Language Models (LLMs). We propose a two-step pipeline that includes the initial extraction of entity candidates, followed by their refinement and linkage to the canonical entities and relations of the knowledge graph. Finally, we utilize Wikidata relation constraints to select only verified triplets. We compare our approach to a model that was fine-tuned on a machine-generated dataset and demonstrate that it performs better on natural data. Our results suggest that LLM-based triplet extraction from texts, with subsequent verification, is a viable method for real-world applications. © 2024 Association for Computational Linguistics. |
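The verification step can be sketched with a toy constraint table. The `CONSTRAINTS` and `ENTITY_TYPES` data below are illustrative, not actual Wikidata relation constraints: candidate triples survive only if the subject and object types satisfy the relation's domain and range.

```python
# Illustrative domain/range constraints per relation (stand-in for Wikidata's).
CONSTRAINTS = {
    "educated_at": {"subject": "human", "object": "organization"},
    "capital": {"subject": "country", "object": "city"},
}
ENTITY_TYPES = {
    "Marie Curie": "human",
    "Sorbonne": "organization",
    "France": "country",
}

def verify(triples, constraints, types):
    """Keep only triples whose subject/object types satisfy the
    relation's constraints; everything else is filtered out."""
    kept = []
    for s, r, o in triples:
        c = constraints.get(r)
        if c and types.get(s) == c["subject"] and types.get(o) == c["object"]:
            kept.append((s, r, o))
    return kept

candidates = [
    ("Marie Curie", "educated_at", "Sorbonne"),
    ("Sorbonne", "capital", "France"),  # violates the type constraints
]
verified = verify(candidates, CONSTRAINTS, ENTITY_TYPES)
```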
yuexi
voted
Mike
voted
#2303
-
Cherniahovskaya 2006
Decision support in strategic control on the base of knowledge management
2006 IEEE International Technology Management Conference (ICE) 2006;():1-4 2006 DOI: 10.1109/ICE.2006.7477068 · Ref ID: 6218 The paper presents a solution to the problem of increasing the quality of strategic control decisions on the basis of knowledge management. A hypertext knowledge base for collaborative knowledge gathering, storing, management, and presentation is developed. An objective-cognitive analysis methodology is presented for the hypertext knowledge base design. This methodology integrates methods of object-oriented analysis and design with the Unified Modeling Language, semantic analysis, and ontology analysis of the domain. An algorithm of case-based reasoning for decision support is presented. A sample application of the intelligent decision support system in the education process is also shown. |
mohammed afaan
voted
Ishan
voted
#1598
-
Chernomorchenko 2024
Leveraging Taxonomic Information from Large Language Models for Hyponymy Prediction
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14486 LNCS():49-63 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-3-031-54534-4_4 · Ref ID: 4614 Pre-trained language models contain a vast amount of linguistic information as well as knowledge about the structure of the world. Both of these attributes are extremely beneficial for automatic enrichment of semantic graphs, such as knowledge bases and lexical-semantic databases. In this article, we employ generative language models to predict descendants of existing nodes in lexical data structures based on IS-A relations, such as WordNet. To accomplish this, we conduct experiments utilizing diverse formats of artificial text input containing information from lexical taxonomy for the English and Russian languages. Our findings demonstrate that the incorporation of data from the knowledge graph into a text input significantly affects the quality of hyponym prediction. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
Ishan
voted
Srividya
voted
#551
-
Chi 2024
Maximizing the Social Welfare of Decentralized Knowledge Inference through Evolutionary Game
IEEE Conference on Computer Communications (IEEE INFOCOM) 2024;(): Vancouver, CANADA Ieee 2024 DOI: 10.1109/infocomwkshps61880.2024.10620663 · Ref ID: 3481 To broaden their domain knowledge coverage, large language models (LLMs) increasingly incorporate extensive corpus data from various industries. These heterogeneous datasets are often maintained by different stakeholders, where issues of data heterogeneity, privacy, and the network cost of data transmission have attracted much attention. To address these challenges, researchers have studied the integration of LLMs with knowledge graphs to manage data heterogeneity and with edge computing to ensure data privacy and transmission efficiency. In this work, we introduce a reputation system and a spot-check mechanism for a decentralized knowledge inference system in which edge nodes can collaborate with others for knowledge sharing while preserving their data privacy. We then use an evolutionary game model to study the dynamic decision-making between requestors and workers. Moreover, we show that higher reward values and higher model quality accelerate the maximization of social welfare. |
mohammed afaan
voted
yuexi
voted
#426
-
Chis 2024
A Knowledge Graph Approach to Cyber Threat Mitigation Derived from Data Flow Diagrams
International Conference on Automation, Quality and Testing, Robotics (AQTR) 2024;():71-76 Cluj-Napoca, ROMANIA Ieee 2024 DOI: 10.1109/aqtr61889.2024.10554074 · Ref ID: 3002 Data Flow Diagrams (DFD) have proven effective in designing and analyzing the flow of data in enterprise systems. They serve as indispensable tools for enterprises that are undergoing transition to cloud services. DFDs aid in understanding the current processes, identifying interfaces and integration points that require security measures. This paper reports a Design Science project to mitigate the cyber security threats at the design phase of a system and to perform auditing of an existing system through knowledge graphs. The proposal leverages knowledge gathered from various sources in a knowledge graph to identify semantic relationships and patterns, enabling automated inference, analysis and detection of vulnerability patterns. Furthermore, LLM-based (large language models) capabilities transform data management details captured as Data Flow Diagrams (DFD) into knowledge graphs for semantic querying and improved decision support. |
mohammed afaan
voted
yuexi
voted
#2520
-
Chishti 2014
A grounding of business process modeling based on temporal logic
International Conference on Information Society (i-Society 2014) 2014;():266-273 2014 DOI: 10.1109/i-Society.2014.7009058 · Ref ID: 6130 This paper proposes a grounding for business process modeling (BPM) based on a general theory of time that provides an axiomatic system. First-order logic is used to give a clear definition of an abstract business process and the corresponding temporal relations, including relations derived from a single "Meets" relation. The temporal logic used here treats time intervals and time points on an equal footing. We use a model-theoretic approach, in which an abstract business process is represented as a formal system and mapped to an instance or concrete realization. We also use the resolution theorem to establish its soundness and completeness properties. A process temporal graph, a directed graph with a defined graphical notation, is introduced to represent the temporal knowledge: arcs represent time elements, vertices represent the "Meets" relation, and the graph also allows expression of both logical AND and OR. A real-world realization of the corresponding graph is considered an instance of an abstract business process. Soundness and completeness properties of the process temporal graph are established using reachability analysis. |
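Deriving further temporal relations from the single "Meets" primitive can be illustrated concretely. The interval names and the particular derived relation below are hypothetical examples, not the paper's formalization: here, interval `i` is before `j` whenever some interval `k` meets both, i.e. Meets(i, k) and Meets(k, j).

```python
# A "Meets" relation over named intervals: (i, k) means interval i
# ends exactly where interval k begins.
meets = {("a", "b"), ("b", "c"), ("c", "d")}

def before(i, j, meets_rel):
    """Derived relation: Before(i, j) iff exists k with
    Meets(i, k) and Meets(k, j)."""
    intervals = {x for pair in meets_rel for x in pair}
    return any((i, k) in meets_rel and (k, j) in meets_rel for k in intervals)
```

Other Allen-style relations (overlaps, during, and so on) can be defined from "Meets" in the same existential style, which is why a single primitive suffices for the axiomatic system.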
mohammed afaan
voted
yuexi
voted
#3131
-
Cho 2024
FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning
Proceedings of the 5th ACM International Conference on AI in Finance 2024;():591–599 Brooklyn, NY, USA Association for Computing Machinery 2024 DOI: 10.1145/3677052.3698597 · Ref ID: 7291 |
Mike
voted
Srividya
voted
#317
-
Cho 2023
Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies
Cho, Ye-eun. 2023. Grammatical illusions in BERT: Attraction effects of subject-verb agreement and reflexive-antecedent dependencies. Linguistic Research 40(2): 317-352. The phenomenon of attraction effects, whereby a verb erroneously retrieves a syntactically inaccessible but feature-matching noun, is a type of grammatical illusion (Phillips, Wagers, and Lau 2011) that can occur in long-distance subject-verb agreement in human sentence processing (Wagers et al. 2009). In contrast, reflexive-antecedent dependencies have been claimed to lack attraction effects when the reflexive and the antecedent mismatch (Dillon et al. 2013). Yet, some other studies have shown that attraction effects have been observed in reflexive-antecedent dependencies when the number of feature mismatches between the reflexive and the antecedent increases (Parker and Philips 2017). These findings suggest that there are different cue weightings based on the predictability of the dependency, and these cues are combined according to different cue-combination schemes, such as a linear or a non-linear cue-combination rule (Parker 2019). These linguistic phenomena can be used to analyze how linguistic features are accessed and combined within the internal states of Deep Neural Network (DNN) language models. In the linguistic representations of BERT (Devlin et al. 2018), one of the pre-trained DNN language models, various types of linguistic information are encoded in each layer (Jawahar et al. 2019) and combined while passing through the layers. By measuring the performance of the Masked Language Model (MLM), this study finds that both subject-verb agreement and reflexive-antecedent dependencies show attraction effects and follow the linear-combinatoric rule in BERT. 
The different results from human sentence processing suggest that the self-attention mechanism of BERT may not be able to capture the differences in the predictability of the dependency as effectively as memory retrieval mechanisms in humans. These findings have important implications for developing more understandable and interpretable explainable-AI (xAI) systems that better capture the complexities of human language processing. (Sungkyunkwan University) |
Mike
voted
Srividya
voted
#561
-
Choi 2021
MEM-KGC: Masked Entity Model for Knowledge Graph Completion With Pre-Trained Language Model
The knowledge graph completion (KGC) task aims to predict missing links in knowledge graphs. Recently, several KGC models based on translational distance or semantic matching methods have been proposed and have achieved meaningful results. However, existing models have a significant shortcoming: they cannot train entity embeddings when an entity does not appear in the training phase. As a result, such models use randomly initialized embeddings for entities that are unseen in the training phase, causing a critical decrease in performance during the test phase. To solve this problem, we propose a new approach that performs the KGC task by utilizing the masked language model (MLM) objective of a pre-trained language model. Given a triple (head entity, relation, tail entity), we mask the tail entity and consider the head entity and the relation as a context for the tail entity. The model then predicts the masked entity from among all entities. Then, the task is conducted by the same process as an MLM, which predicts a masked token with a given context of tokens. Our experimental results show that the proposed model achieves significantly improved performances when unseen entities appear during the test phase and achieves state-of-the-art performance on the WN18RR dataset. |
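The masked-entity formulation can be sketched conceptually. The toy scorer below stands in for the pre-trained MLM and the data is invented; what the sketch preserves is the shape of the task: given a (head, relation) context with the tail masked, score every entity in the vocabulary and predict the best one.

```python
# Toy training triples and entity vocabulary (illustrative data).
TRIPLES = [
    ("tokyo", "capital_of", "japan"),
    ("berlin", "capital_of", "germany"),
]
ENTITIES = ["japan", "germany", "france"]

def score(head, rel, tail, triples):
    """Toy context scorer standing in for the MLM: 1.0 if the completed
    triple is attested, else 0.0."""
    return 1.0 if (head, rel, tail) in triples else 0.0

def predict_tail(head, rel, entities, triples):
    """Predict the masked tail: (head, rel, [MASK]) -> argmax over entities."""
    return max(entities, key=lambda e: score(head, rel, e, triples))

pred = predict_tail("tokyo", "capital_of", ENTITIES, TRIPLES)
```

In the actual model, the scorer is the MLM head of the language model, which is what lets unseen entities be handled through their textual context rather than a trained embedding.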
Voted: mohammed afaan, Ishan
#441
-
Choi 2023
Knowledge graph extension with a pre-trained language model via unified learning method
Knowledge graphs (KGs) are collections of real-world knowledge represented in a structured form of triples. Since they are manually built in their nascent stage, there is a common problem that some links (triples) are missing. Knowledge graph completion (KGC) aims to find those missing links and thereby complete the KGs. However, as knowledge increases through diverse sources, new entities have emerged explosively and need to be connected to existing KGs. Thus, open-world KGC targets extending KGs to those new entities. Dealing with those new entities is challenging because they do not have any connection with entities in the existing KGs. One way to handle the new ones is to embed them with their textual descriptions using pre-trained word embeddings and score them in the graph-vector space with the existing typical KGC models. These models have produced meaningful results, but there is still a lack of studies on utilizing the latest neural networks, such as pre-trained language models, which are known to be better at capturing contexts than pre-trained word embeddings. This paper proposes a novel model that effectively connects new entities and existing KGs through a pre-trained language model. To effectively handle the problem, we utilize two learning methods: one is the classification method of the masked language model (MLM) that predicts a word from a huge vocabulary set given a context, and the other is multi-task learning based on Multi-Task Deep Neural Networks (MT-DNN). Based on these methods, the model first generates an embedding of a new entity using its textual description and then uses the embedding to find one of the existing entities in a KG to which the new entity can be connected. 
The experimental results on three benchmark datasets, DBPedia50k, FB15k-237-OWE, and FB20k, show that the proposed model improves performances by 9.2%p, 4.4%p, and 11.1%p, respectively, and achieves new state-of-the-art performance for all datasets. (c) 2022 Elsevier B.V. All rights reserved. |
Voted: Mike, Srividya
#14
-
Choi 2023
ALBERT with Knowledge Graph Encoder Utilizing Semantic Similarity for Commonsense Question Answering
Recently, pre-trained language representation models such as bidirectional encoder representations from transformers (BERT) have been performing well in commonsense question answering (CSQA). However, there is a problem that the models do not directly use explicit information from knowledge sources existing outside the model. To augment this, additional methods such as knowledge-aware graph network (KagNet) and multi-hop graph relation network (MHGRN) have been proposed. In this study, we propose to use the latest pre-trained language model, a lite bidirectional encoder representations from transformers (ALBERT), with a knowledge graph information extraction technique. We also propose applying a novel method, schema graph expansion, to recent language models. Then, we analyze the effect of applying knowledge graph-based knowledge extraction techniques to recent pre-trained language models and confirm that schema graph expansion is effective to some extent. Furthermore, we show that our proposed model can achieve better performance than existing KagNet and MHGRN models on the CommonsenseQA dataset. |
Voted: Srividya, Xinchen
#3800
-
Choi 2024
Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games
arXiv 2024;(): 2024 Ref ID: 8588 Large language models (LLMs) have become a dominant approach in natural language processing, yet their internal knowledge structures remain largely unexplored. In this paper, we analyze the internal knowledge structures of LLMs using historical medal tallies from the Olympic Games. We task the models with providing the medal counts for each team and identifying which teams achieved specific rankings. Our results reveal that while state-of-the-art LLMs perform remarkably well in reporting medal counts for individual teams, they struggle significantly with questions about specific rankings. This suggests that the internal knowledge structures of LLMs are fundamentally different from those of humans, who can easily infer rankings from known medal counts. To support further research, we publicly release our code, dataset, and model outputs. |
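The probe described above rests on a simple derivation: given per-team medal counts, rankings follow directly by sorting, an inference humans make easily but the tested LLMs reportedly struggle with. A minimal sketch of the two question types the authors ask; the tallies below are invented toy numbers, not real Olympic data.

```python
# Toy medal tallies (illustrative, not real Olympic results).
medals = {"A": 40, "B": 38, "C": 27, "D": 27}

def medal_count(team):
    """Question type 1: how many medals did this team win?"""
    return medals[team]

def team_at_rank(k):
    """Question type 2: which team(s) achieved rank k (1-based)?
    Ties share a rank (dense ranking over distinct counts)."""
    ordered = sorted(set(medals.values()), reverse=True)
    target = ordered[k - 1]
    return sorted(t for t, c in medals.items() if c == target)

first = team_at_rank(1)   # the top-ranked team
third = team_at_rank(3)   # teams C and D share third place here
```

The asymmetry the paper reports is that models answer the `medal_count`-style lookup well but fail the `team_at_rank`-style question, even though the second is computable from the first.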
Voted: yuexi, Mike
#1237
-
Choi 2024
Embodied CoT Distillation From LLM To Off-the-shelf Agents
Proceedings of Machine Learning Research 2024;235():8702-8721 ML Research Press 2024 Ref ID: 4359 We address the challenge of utilizing large language models (LLMs) for complex embodied tasks, in the environment where decision-making systems operate timely on capacity-limited, off-the-shelf devices. We present DEDER, a framework for decomposing and distilling the embodied reasoning capabilities from LLMs to efficient, small language model (sLM)-based policies. In DEDER, the decision-making process of LLM-based strategies is restructured into a hierarchy with a reasoning-policy and planning-policy. The reasoning-policy is distilled from the data that is generated through the embodied in-context learning and self-verification of an LLM, so it can produce effective rationales. The planning-policy, guided by the rationales, can render optimized plans efficiently. In turn, DEDER allows for adopting sLMs for both policies, deployed on off-the-shelf devices. Furthermore, to enhance the quality of intermediate rationales, specific to embodied tasks, we devise the embodied knowledge graph, and to generate multiple rationales timely through a single inference, we also use the contrastively prompted attention model. Our experiments with the ALFRED benchmark demonstrate that DEDER surpasses leading language planning and distillation approaches, indicating the applicability and efficiency of sLM-based embodied policies derived through DEDER. Copyright 2024 by the author(s) |
Voted: Ishan, Kwesi
#3338
-
Choubey 2024
Distill-SynthKG: Distilling Knowledge Graph Synthesis Workflow for Improved Coverage and Efficiency
arXiv 2024;(): 2024 Ref ID: 8740 Knowledge graphs (KGs) generated by large language models (LLMs) are becoming increasingly valuable for Retrieval-Augmented Generation (RAG) applications that require knowledge-intensive reasoning. However, existing KG extraction methods predominantly rely on prompt-based approaches, which are inefficient for processing large-scale corpora. These approaches often suffer from information loss, particularly with long documents, due to the lack of specialized design for KG construction. Additionally, there is a gap in evaluation datasets and methodologies for ontology-free KG construction. To overcome these limitations, we propose SynthKG, a multi-step, document-level ontology-free KG synthesis workflow based on LLMs. By fine-tuning a smaller LLM on the synthesized document-KG pairs, we streamline the multi-step process into a single-step KG generation approach called Distill-SynthKG, substantially reducing the number of LLM inference calls. Furthermore, we re-purpose existing question-answering datasets to establish KG evaluation datasets and introduce new evaluation metrics. Using KGs produced by Distill-SynthKG, we also design a novel graph-based retrieval framework for RAG. Experimental results demonstrate that Distill-SynthKG not only surpasses all baseline models in KG quality – including models up to eight times larger – but also consistently excels in retrieval and question-answering tasks. Our proposed graph retrieval framework also outperforms all KG-retrieval methods across multiple benchmark datasets. We release the SynthKG dataset and Distill-SynthKG model publicly to support further research and development. |
Voted: mohammed afaan, yuexi
#3422
-
Chuang 2024
FaithLM: Towards Faithful Explanations for Large Language Models
arXiv 2024;(): 2024 Ref ID: 8080 Large Language Models (LLMs) have become proficient in addressing complex tasks by leveraging their extensive internal knowledge and reasoning capabilities. However, the black-box nature of these models complicates the task of explaining their decision-making processes. While recent advancements demonstrate the potential of leveraging LLMs to self-explain their predictions through natural language (NL) explanations, their explanations may not accurately reflect the LLMs' decision-making process due to a lack of fidelity optimization on the derived explanations. Measuring the fidelity of NL explanations is a challenging issue, as it is difficult to manipulate the input context to mask the semantics of these explanations. To this end, we introduce FaithLM to explain the decision of LLMs with NL explanations. Specifically, FaithLM designs a method for evaluating the fidelity of NL explanations by incorporating the contrary explanations to the query process. Moreover, FaithLM conducts an iterative process to improve the fidelity of derived explanations. Experiment results on three datasets from multiple domains demonstrate that FaithLM can significantly improve the fidelity of derived explanations, which also provides a better alignment with the ground-truth explanations. |
Voted: Ishan, Srividya
#1342
-
Colas 2022
GAP: A Graph-aware Language Model Framework for Knowledge Graph-to-Text Generation
Proceedings - International Conference on Computational Linguistics, COLING 2022;29():5755-5769 Association for Computational Linguistics (ACL) 2022 Ref ID: 5306 Recent improvements in KG-to-text generation are due to additional auxiliary pre-training tasks designed to give the fine-tune task a boost in performance. These tasks require extensive computational resources while only suggesting marginal improvements. Here, we demonstrate that by fusing graph-aware elements into existing pre-trained language models, we are able to outperform state-of-the-art models and close the gap imposed by additional pre-training tasks. We do so by proposing a mask structure to capture neighborhood information and a novel type encoder that adds a bias to the graph-attention weights depending on the connection type. Experiments on two KG-to-text benchmark datasets show our models are competitive while involving fewer parameters and no additional pre-training tasks. By formulating the problem as a framework, we can interchange the various proposed components and begin interpreting KG-to-text generative models based on the topological and type information found in a graph. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved. |
Voted: Xinchen, mohammed afaan
#3651
-
Colombo 2024
Leveraging Knowledge Graphs and LLMs to Support and Monitor Legislative Systems
arXiv 2024;(): 2024 Ref ID: 8612 Knowledge Graphs (KGs) have been used to organize large datasets into structured, interconnected information, enhancing data analytics across various fields. In the legislative context, one potential natural application of KGs is modeling the intricate set of interconnections that link laws and their articles with each other and the broader legislative context. At the same time, the rise of large language models (LLMs) such as GPT has opened new opportunities in legal applications, such as text generation and document drafting. Despite their potential, the use of LLMs in legislative contexts is critical since it requires the absence of hallucinations and reliance on up-to-date information, as new laws are published on a daily basis. This work investigates how Legislative Knowledge Graphs and LLMs can synergize and support legislative processes. We address three key questions: the benefits of using KGs for legislative systems, how LLM can support legislative activities by ensuring an accurate output, and how we can allow non-technical users to use such technologies in their activities. To this aim, we develop Legis AI Platform, an interactive platform focused on Italian legislation that enhances the possibility of conducting legislative analysis and that aims to support lawmaking activities. |
Voted: mohammed afaan, yuexi
#3270
-
Colon-Hernandez 2021
Combining pre-trained language models and structured knowledge
arXiv 2021;(): 2021 Ref ID: 7436 In recent years, transformer-based language models have achieved state-of-the-art performance on various NLP benchmarks. These models are able to extract mostly distributional information, with some semantics, from unstructured text; however, it has proven challenging to integrate structured information, such as knowledge graphs, into these models. We examine a variety of approaches to integrating structured knowledge into current language models and identify challenges and possible opportunities to leverage both structured and unstructured information sources. From our survey, we find that there are still opportunities in exploiting adapter-based injections and that it may be possible to combine several of the explored approaches into one system. |
Voted: Srividya, Xinchen
#3179
-
Colon-Hernandez 2023
Adversarial Transformer Language Models for Contextual Commonsense Inference
arXiv 2023;(): 2023 Ref ID: 7647 Contextualized or discourse aware commonsense inference is the task of generating coherent commonsense assertions (i.e., facts) from a given story, and a particular sentence from that story. Some problems with the task are: lack of controllability for topics of the inferred facts; lack of commonsense knowledge during training; and, possibly, hallucinated or false facts. In this work, we utilize a transformer model for this task and develop techniques to address the aforementioned problems in the task. We control the inference by introducing a new technique we call "hinting". Hinting is a kind of language model prompting, that utilizes both hard prompts (specific words) and soft prompts (virtual learnable templates). This serves as a control signal to advise the language model "what to talk about". Next, we establish a methodology for performing joint inference with multiple commonsense knowledge bases. Joint inference of commonsense requires care, because it is imprecise and the level of generality is more flexible. You want to be sure that the results "still make sense" for the context. To this end, we align the textual version of assertions from three knowledge graphs (ConceptNet, ATOMIC2020, and GLUCOSE) with a story and a target sentence. This combination allows us to train a single model to perform joint inference with multiple knowledge graphs. We show experimental results for the three knowledge graphs on joint inference. Our final contribution is exploring a GAN architecture that generates the contextualized commonsense assertions and scores them as to their plausibility through a discriminator. The result is an integrated system for contextual commonsense inference in stories, that can controllably generate plausible commonsense assertions, and takes advantage of joint inference between multiple commonsense knowledge bases. |
Voted: Srividya, Ishan
#2026
-
Corrado 2023
VO.I.C.E. FIRST: Supporting Human Assistants with Real-Time Voice Understanding
2023 IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2023 - Proceedings 2023;():1104-1109 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/MetroXRAINE58569.2023.10405568 · Ref ID: 4965 While AI and automation have made significant strides in customer support, there are still situations where human intervention via voice channels is necessary to provide the best possible customer experience. In fact, although AI and chatbots have become increasingly sophisticated, they may not always be able to handle complex or nuanced customer issues. Human agents can better understand and respond to these situations, providing tailored solutions. At the same time, solving non-trivial customer problems often requires access to knowledge bases and contextual customer information, for which AI is particularly well suited. Hence the idea of integrating human and artificial intelligence in a hybrid solution. We developed an AI system to help human assistants in the process of handling conversations. This system can be viewed as a collaborative bot (cobot). The cobot captures the audio stream of the conversation, converts it to text and analyzes it in real time. The extracted tokens are classified and sent to a reasoning system based on a knowledge graph, that provides information and action suggestions to the human assistant. Assistants are also capable of providing information to the reasoning system, utilizing their human understanding of the client's circumstances as they unfold. While designing a prototypical solution for utility services, we have faced the problem of real-time use of computationally complex procedures, including spontaneous speech understanding and knowledge-based heuristic rules. Moreover, we adopted a standards-based approach and experimented with open source reasoners and publicly available language models. 
The paper outlines the system architecture and design, and discusses the results of the first experiments. © 2023 IEEE. |
Voted: mohammed afaan, Ishan
#37
-
Corrado 2023
Assisting the Assistant: A Cobot for Voice Customer Support
2nd International Conference on Hybrid Human-Artificial Intelligence (HHAI) 2023;368():330-339 Munich, GERMANY Ios Press 2023 DOI: 10.3233/faia230096 · Ref ID: 3445 Despite recent advances in automation, customer support still requires a substantial amount of human intervention through voice channels. With the aim of improving the work of human assistants, we developed a collaborative bot (cobot) to help them in the process of handling customer voice interactions. The cobot is a reasoning agent that starts from loading background customer data into a dynamic knowledge graph. Then it captures the audio stream of the conversation, converts it to text in real time, analyzes the blocks of conversation with neural technologies and "thinks" about the results. Assistants can also supply data to the cobot, based on the information they gather from the ongoing conversation. The reasoning agent provides information and action suggestions to the human assistant by applying heuristics on data collected from both automatic and human sources, based on a task and domain-specific conceptual models (ontologies). While designing a prototypical solution for utility services in Italy, we are faced with many problems, including spontaneous speech understanding, factual and linguistic knowledge representation, and efficient heuristic reasoning. We adopted a standards-based approach and experimented with open source reasoners and publicly available language models. The paper presents preliminary findings and outlines the system design, with focus on the interplay of neural language processing and logic reasoning. |
Voted: mohammed afaan, yuexi
#3355
-
D'Abramo 2024
Dynamic Few-Shot Learning for Knowledge Graph Question Answering
arXiv 2024;(): 2024 Ref ID: 8439 Large language models present opportunities for innovative Question Answering over Knowledge Graphs (KGQA). However, they are not inherently designed for query generation. To bridge this gap, solutions have been proposed that rely on fine-tuning or ad-hoc architectures, achieving good results but limited out-of-domain distribution generalization. In this study, we introduce a novel approach called Dynamic Few-Shot Learning (DFSL). DFSL integrates the efficiency of in-context learning and semantic similarity and provides a generally applicable solution for KGQA with state-of-the-art performance. We run an extensive evaluation across multiple benchmark datasets and architecture configurations. |
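The core move of DFSL as described above, selecting in-context exemplars by semantic similarity to the incoming question rather than using a fixed few-shot set, can be sketched as follows. Bag-of-words cosine similarity stands in for a real sentence encoder, and the example pool, the `top_k` helper, and the query strings are illustrative assumptions, not the paper's implementation.

```python
# Sketch of similarity-based few-shot selection for KGQA prompting.
from collections import Counter
import math

# Hypothetical pool of (natural-language question, query) exemplars.
POOL = [
    ("who directed the film Inception", "SELECT ?d WHERE { :Inception :director ?d }"),
    ("who wrote the novel Dune", "SELECT ?a WHERE { :Dune :author ?a }"),
    ("capital of France", "SELECT ?c WHERE { :France :capital ?c }"),
]

def vec(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(question, k=2):
    """Pick the k pool exemplars most similar to the new question."""
    q = vec(question)
    scored = sorted(POOL, key=lambda ex: cosine(q, vec(ex[0])), reverse=True)
    return scored[:k]

def build_prompt(question, k=2):
    """Assemble the dynamically chosen shots plus the new question."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in top_k(question, k))
    return f"{shots}\nQ: {question}\nA:"

prompt = build_prompt("who directed the film Dune")
```

With a real encoder in place of `vec`, the same structure gives each incoming question its own tailored few-shot context without any fine-tuning.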
Voted: yuexi, mohammed afaan
#1174
-
D’Aragona 2024
Design of a Knowledge Hub of Heterogeneous Multisource Documents to support Public Authorities
CEUR Workshop Proceedings 2024;3762():430-435 CEUR-WS 2024 Ref ID: 4200 This contribution outlines the design of a Knowledge Hub of heterogeneous documents related to the Mediterranean Action Plan UNEP-MAP of the United Nations Environment Program [1]. The Knowledge Hub is intended to serve as a resource to assist public authorities and users with different backgrounds and needs in accessing information efficiently. Users can either formulate natural language queries or navigate an automatically generated knowledge graph to find relevant documents. The Knowledge Hub is designed based on state-of-the-art Large Language Models (LLMs). A user-evaluation experiment was conducted, testing publicly available models on a subset of documents using distinct LLM settings. This step aimed to identify the best-performing model for further use in classifying the documents with respect to the topics of interest. © 2024 Copyright for this paper by its authors. |
Voted: mohammed afaan, yuexi
#1285
-
D’Souza 2023
Evaluating Prompt-Based Question Answering for Object Prediction in the Open Research Knowledge Graph
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14146 LNCS():508-515 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-39847-6_40 · Ref ID: 5201 Recent investigations have explored prompt-based training of transformer language models for new text genres in low-resource settings. This approach has proven effective in transferring pre-trained or fine-tuned models to resource-scarce environments. This work presents the first results on applying prompt-based training to transformers for scholarly knowledge graph object prediction. Methodologically, it stands out in two main ways: 1) it deviates from previous studies that propose entity and relation extraction pipelines, and 2) it tests the method in a significantly different domain, scholarly knowledge, evaluating the linguistic, probabilistic, and factual generalizability of large-scale transformer models. Our findings demonstrate that: i) out-of-the-box transformer models underperform on the new scholarly domain, ii) prompt-based training improves performance by up to 40% in relaxed evaluation, and iii) tests of the models in a distinct domain reveal a gap in capturing domain knowledge, highlighting the need for increased attention and resources in the scholarly domain for transformer models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023. |
Voted: Ishan, Srividya
#3182
-
Da 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
arXiv 2021;(): 2021 Ref ID: 7434 Recently, commonsense knowledge models - pretrained language models (LM) fine-tuned on knowledge graph (KG) tuples - showed that considerable amounts of commonsense knowledge can be encoded in the parameters of large language models. However, as parallel studies show that LMs are poor hypothesizers of declarative commonsense relationships on their own, it remains unclear whether this knowledge is learned during pretraining or from fine-tuning on KG examples. To investigate this question, we train commonsense knowledge models in few-shot settings to study the emergence of their commonsense representation abilities. Our results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to encoded knowledge learned during pretraining. Importantly, our analysis of absolute, angular, and distributional parameter changes during few-shot fine-tuning provides novel insights into how this interface is learned. |
Voted: Xinchen, Ishan
#316
-
Dai 2025
A GPT-assisted iterative method for extracting domain knowledge from a large volume of literature of electromagnetic wave absorbing materials with limited manually annotated data
Research on electromagnetic wave absorbing materials is an important part of materials science. Each year, a substantial amount of academic literature is published in this field, containing critical information. Rapid and effective knowledge extraction from these documents is key to accelerating field development, and automated knowledge extraction based on deep learning provides a solution to this challenge. However, deep learning models typically require extensive annotated data for training, which is time-consuming and expensive to obtain in highly specialized subfields. To address this issue, this paper presents a GPT-assisted iterative training method that uses only 30 manually annotated literature abstracts as a training set and ultimately achieves an F1 score of 82.94% for a named entity recognition (NER) model. The effectiveness of this model is demonstrated by comparing it with other large language models commonly used in materials science on a custom dataset. We constructed a knowledge extraction framework centered around the obtained NER model and collected literature on electromagnetic wave absorbing materials from the last decade. The extraction and application results demonstrate the practicality of our framework. |
Voted: mohammed afaan, yuexi
#461
-
Dai 2022
Knowledge Neurons in Pretrained Transformers
60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():8493-8502 Dublin, IRELAND Assoc Computational Linguistics-Acl 2022 Ref ID: 3578 Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus (Petroni et al., 2019; Jiang et al., 2020b). In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by introducing the concept of knowledge neurons. Specifically, we examine the fill-in-the-blank cloze task for BERT. Given a relational fact, we propose a knowledge attribution method to identify the neurons that express the fact. We find that the activation of such knowledge neurons is positively correlated to the expression of their corresponding facts. In our case studies, we attempt to leverage knowledge neurons to edit (such as update, and erase) specific factual knowledge without fine-tuning. Our results shed light on understanding the storage of knowledge within pretrained Transformers. The code is available at https://github.com/Hunter-DDM/knowledge-neurons. |
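The knowledge-neuron idea in the abstract, identifying neurons whose activation correlates with the expression of a fact and then editing them, can be illustrated in miniature. The 3-neuron activation table and the mean-difference score below are toy assumptions for exposition; the paper's actual attribution method is gradient-based and operates on BERT's feed-forward layers.

```python
# Toy knowledge-neuron probe: find the hidden unit whose activation best
# tracks whether a fact is expressed, then suppress it to "erase" the fact.

# Activations of 3 hidden neurons on 4 prompts (rows), plus a label per
# prompt: 1 if the prompt expresses the target fact, 0 otherwise.
acts = [
    [0.9, 0.1, 0.5],
    [0.8, 0.2, 0.4],
    [0.1, 0.3, 0.5],
    [0.2, 0.1, 0.6],
]
labels = [1, 1, 0, 0]

def correlation_score(neuron):
    """Mean activation on fact-expressing prompts minus mean on the rest."""
    pos = [row[neuron] for row, y in zip(acts, labels) if y == 1]
    neg = [row[neuron] for row, y in zip(acts, labels) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

scores = [correlation_score(n) for n in range(3)]
knowledge_neuron = max(range(3), key=lambda n: scores[n])

def erase(row, neuron):
    """Zero the identified neuron's activation, mimicking the 'erase' edit."""
    return [0.0 if i == neuron else v for i, v in enumerate(row)]
```

In this toy setup neuron 0 is the clear winner; the paper's finding is that suppressing or amplifying such neurons changes whether the model produces the corresponding fact, without any fine-tuning.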
Voted: yuexi, Mike
#1491
-
Dalal 2021
Knowledge augmented language models for causal question answering
CEUR Workshop Proceedings 2021;3005():17-24 CEUR-WS 2021 Ref ID: 5694 The task of causal question answering broadly involves reasoning about causal relations and causality over a provided premise. Causal question answering can be expressed across a variety of tasks including commonsense question answering, procedural reasoning, reading comprehension, and abductive reasoning. Transformer-based pretrained language models have shown great promise across many natural language processing (NLP) applications. However, these models are reliant on distributional knowledge learned during the pretraining process and are limited in their causal reasoning capabilities. Causal knowledge, often represented as cause-effect triples in a knowledge graph, can be used to augment and improve the causal reasoning capabilities of language models. There is limited work exploring the efficacy of causal knowledge for question answering tasks. We consider the challenge of structuring causal knowledge in language models and developing a unified model that can solve a broad set of causal question answering tasks. Copyright © 2021 for this paper by its authors. |
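One common recipe for the augmentation this abstract describes is to verbalize cause-effect triples from a causal knowledge graph and prepend the matching ones to the model input. A minimal sketch under that assumption; the triple store, the substring matching rule, and the `augment` helper are illustrative, not the paper's method.

```python
# Sketch: augment a question with verbalized cause-effect triples.

# Hypothetical causal KG as (cause, relation, effect) triples.
TRIPLES = [
    ("rain", "causes", "wet roads"),
    ("wet roads", "causes", "traffic accidents"),
    ("exercise", "causes", "improved mood"),
]

def verbalize(triple):
    """Turn a triple into a plain-text fact the LM can consume."""
    cause, _, effect = triple
    return f"{cause} causes {effect}."

def augment(question):
    """Prepend facts whose cause or effect is mentioned in the question."""
    q = question.lower()
    facts = [verbalize(t) for t in TRIPLES if t[0] in q or t[2] in q]
    return " ".join(facts) + " " + question if facts else question

out = augment("Why do wet roads lead to delays?")
```

The augmented string then goes to the language model as-is, giving it explicit causal context it may not have learned distributionally during pretraining.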
Voted: Ishan, Srividya
#3870
-
Dani 2024
SemioLLM: Assessing Large Language Models for Semiological Analysis in Epilepsy Research
arXiv 2024;(): 2024 Ref ID: 8444 Large Language Models have shown promising results in their ability to encode general medical knowledge in standard medical question-answering datasets. However, their potential application in clinical practice requires evaluation in domain-specific tasks, where benchmarks are largely missing. In this study (SemioLLM), we test the ability of state-of-the-art LLMs (GPT-3.5, GPT-4, Mixtral 8x7B, and Qwen-72chat) to leverage their internal knowledge and reasoning for epilepsy diagnosis. Specifically, we obtain likelihood estimates linking unstructured text descriptions of seizures to seizure-generating brain regions, using an annotated clinical database containing 1269 entries. We evaluate the LLMs' performance, confidence, reasoning, and citation abilities in comparison to clinical evaluation. Models achieve above-chance classification performance, with prompt engineering significantly improving their outcome and some models achieving close-to-clinical performance and reasoning. However, our analyses also reveal significant pitfalls, with several models being overly confident while showing poor performance, as well as exhibiting citation errors and hallucinations. In summary, our work provides the first extensive benchmark comparing current SOTA LLMs in the medical domain of epilepsy and highlights their ability to leverage unstructured texts from patients' medical history to aid diagnostic processes in health care. |
Voted: brandon, Kwesi
#652
-
Das 2018
Phrase2VecGLM: Neural generalized language model-based semantic tagging for complex query reformulation in medical IR
SIGBioMed 17th Workshop on Biomedical Natural Language Processing (BioNLP) 2018;():118-128 Melbourne, AUSTRALIA Assoc Computational Linguistics-Acl 2018 Ref ID: 3287 In fact-based information retrieval, state-of-the-art performance is traditionally achieved by knowledge graphs driven by knowledge bases, as they can represent facts about and capture relationships between entities very well. However, in domains such as medical information retrieval, where addressing specific information needs of complex queries may require understanding query intent by capturing novel associations between potentially latent concepts, these systems can fall short. In this work, we develop a novel, completely unsupervised, neural language model-based ranking approach for semantic tagging of documents, using the document to be tagged as a query into the model to retrieve candidate phrases from top-ranked related documents, thus associating every document with novel related concepts extracted from the text. For this we extend the word embedding-based generalized language model (GLM) due to (Ganguly et al., 2015), to employ phrasal embeddings, and use the semantic tags thus obtained for downstream query expansion, both directly and in feedback loop settings. Our method, evaluated using the TREC 2016 clinical decision support challenge dataset, shows statistically significant improvement not only over various baselines that use standard MeSH terms and UMLS concepts for query expansion, but also over baselines using human expert-assigned concept tags for the queries, on top of a standard Okapi BM25-based document retrieval system. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#965
-
Das 2024
AcKnowledge: Acquired Knowledge Representation by Small Language Model Without Pre-training
KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():83-95 Association for Computational Linguistics (ACL) 2024 Ref ID: 4302 Large language models (LLMs) are pre-trained on enormous amounts of text data and show acclaimed success in knowledge representation. However, there are two bottlenecks with this approach. (1) Pre-training data cannot be regularly updated once the models are deployed, and it is not very fruitful if the model cannot represent updated knowledge. (2) The consistently increasing size and computational resources make it difficult for noncommercial and individual researchers to fine-tune and scale these language models. Major LLMs with external knowledge are also proprietary. In this paper, we propose AcKnowledge, a framework wrapped around a small, non-pre-trained language model for an open-domain question-answering (QA) experiment. AcKnowledge learns relevant knowledge from the internet via meta-learning based on user questions, and re-learns from user feedback if knowledge is misrepresented. Our efficient knowledge representation framework avoids pre-training overhead while enabling updated information. Benchmarking shows competitive performance against similarly sized state-of-the-art (SoTA) LLMs on gold standard QA datasets, demonstrating the potential of integrating internet search and user feedback for improved performance and generalizability. The repository of the work is available at https://github.com/SouravD-Me/AcKnowledge-KnowledgeLM-ACL-2024. © 2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#2261
-
Dasgupta 2013
A comprehensive sensor taxonomy and semantic knowledge representation: Energy meter use case
2013 Seventh International Conference on Sensing Technology (ICST) 2013;():791-799 2013 DOI: 10.1109/ICSensT.2013.6727761 · Ref ID: 6109 The increasing use of sensors and their observations in applications like environmental monitoring, security and surveillance, health care, infrastructure, meteorology and others not only generate huge amount of sensor data but also increase complexity of integration of heterogeneous sensor devices, their data formats and procedures of measurements. Therefore ways to manage sensors, sensing devices and systems and thereby handling generation of large volume of sensor data is becoming very important. Formal definition of sensor data encodings and web services to store and access them given by Sensor Web Enablement (SWE) initiative of Open Geospatial Consortium (OGC) provide syntactic interoperability but collecting, reasoning, querying on sensors and their observations require sensor semantic compatibility. It allows users to work with domain concepts, their relations and restrictions, which is an abstraction above the technical nitty-gritty of diverse sensor data format and their integration. The paper describes various sensor concepts and their relationships extending IEEE SUMO upper level ontology and OntoSensor, including SensorML and classifies sensor information into five major sensor knowledge representation (1) hierarchy (2) data (3) function (4) data exchange and (5) domain specific along with code snippets of semantic services generated by mapping between conceptual relationships with structural relationships described in object oriented languages like C++ or Java. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3284
-
Datta 2024
Construction of Hyper-Relational Knowledge Graphs Using Pre-Trained Large Language Models
arXiv 2024;(): 2024 Ref ID: 8188 Extracting hyper-relations is crucial for constructing comprehensive knowledge graphs, but there are limited supervised methods available for this task. To address this gap, we introduce a zero-shot prompt-based method using OpenAI's GPT-3.5 model for extracting hyper-relational knowledge from text. Comparing our model with a baseline, we achieved promising results, with a recall of 0.77. Although our precision is currently lower, a detailed analysis of the model outputs has uncovered potential pathways for future research in this area. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#1365
-
Datta 2023
GREAT AI in Medical Appropriateness and Value-Based-Care
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14418 LNCS():16-33 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-49601-1_2 · Ref ID: 5038 Fee For Service, also known as the Volume Based Care (VBC) model of healthcare, encourages service volume – more service, more reward. This model of care results in unnecessary, inappropriate, and wasted medical services. In the US, Fraud, Waste, and Abuse (FWA) ranges between $760 billion and $935 billion, accounting for approximately 25% of total healthcare spending. In India, the waste caused by FWA is estimated to be as high as 35%. This is due to a lack of smart digital health, absence of AI models, and lack of preventive vigilance against inappropriate medical interventions. Inappropriate medical intervention costs valuable resources and causes patient harm. This paper proposes GREAT AI (Generative, Responsible, Explainable, Adaptive, and Trustworthy Artificial Intelligence) in Medical Appropriateness. We show how GREAT AI is used to offer appropriate medical services. Moreover, we show how GREAT AI can function in a vigilance role to curb FWA. We present two GREAT AI models, namely MAKG (Medical Appropriateness Knowledge Graph) and RAG-GPT (Retrieval Augmented Generation – Generative Pretrained Transformer). MAKG is used as an autonomous coarse-grained medical-inappropriateness vigilance model for payers and regulators, whereas RAG-GPT is used as a fine-grained LLM, with human-in-the-loop, for a medical appropriateness and medical inappropriateness model where the actor human-in-the-loop can be anybody, like providers, patients, payers, regulators, funders, or researchers. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#991
-
deÁvilaMendes 2024
Application of Generative AI as an Enterprise Wikibase Knowledge Graph Q&A System
KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():35-42 Association for Computational Linguistics (ACL) 2024 Ref ID: 4398 Generative AI and Large Language Models are increasingly used in business contexts. One application involves natural language conversations contextualized by company data, which can be accomplished by Enterprise Knowledge Graphs, standardized representations of data. This paper outlines an architecture for implementation of an Enterprise Knowledge Graph using open-source Wikibase software. Additionally, it is presented a Knowledge Graph Q&A System powered by Generative AI. ©2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1865
-
DeBellis 2023
Semantic Interpretation of BERT embeddings with Knowledge Graphs
CEUR Workshop Proceedings 2023;3478():181-191 CEUR-WS 2023 Ref ID: 5300 Pretrained language models have transformed the way we process natural languages, enhancing the performance of related systems. BERT has played a pivotal role in revolutionizing the field of Natural Language Processing (NLP). However, the deep learning framework behind BERT lacks interpretability. Recent research has focused on explaining the knowledge BERT acquires from the textual sources used for pre-training its linguistic model. In this study, we analyze the latent vector space produced by BERT's context-aware word embeddings. Our aim is to determine whether certain areas of the BERT vector space have an explicit meaning related to a Knowledge Graph (KG). Using the Link Prediction (LP) task, we demonstrate the presence of explicit and meaningful regions of the BERT vector space. Moreover, we establish links between BERT's vector space and specific ontology concepts in the KG by learning classification patterns. To the best of our knowledge, this is the first attempt to interpret BERT's learned linguistic knowledge through a KG by relying on its pre-trained context-aware word embeddings. © 2023 CEUR-WS. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1889
-
deSá 2024
Socio-cultural adapted chatbots: Harnessing Knowledge Graphs and Large Language Models for enhanced context awareness
TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop 2024;():21-27 Association for Computational Linguistics (ACL) 2024 Ref ID: 4662 Understanding the socio-cultural context is crucial in machine translation (MT). Although conversational AI systems and chatbots, in particular, are not designed for translation, they can be used for MT purposes. Yet, chatbots often struggle to identify any socio-cultural context during user interactions. In this paper, we highlight this challenge with real-world examples from popular chatbots. We advocate for the use of knowledge graphs as an external source of information that can potentially encapsulate socio-cultural contexts, aiding chatbots in enhancing translation. We further present a method to exploit external knowledge and extract contextual information that can significantly improve text translation, as evidenced by our interactions with these chatbots. © 2024 Association for Computational Linguistics. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2517
-
Decker 2007
A Graphical Notation for Modeling Complex Events in Business Processes
11th IEEE International Enterprise Distributed Object Computing Conference (EDOC 2007) 2007;():27-27 2007 DOI: 10.1109/EDOC.2007.41 · Ref ID: 6615 Using complex event rules for capturing dependencies between business processes is an emerging trend in enterprise information systems. In previous work we have identified a set of requirements for event extensions for business process modeling languages. This paper introduces a graphical language for modeling composite events in business processes, namely BEMN, that fulfills all these requirements. These include event conjunction, disjunction and inhibition as well as cardinality of events whose graphical expression can be factored into flow-oriented process modeling and event rule modeling. Formal semantics for the language are provided. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#2083
-
Dehghani 2014
An abstract methodology for developing knowledge management systems
2014 10th International Conference on Innovations in Information Technology (IIT) 2014;():110-115 2014 DOI: 10.1109/INNOVATIONS.2014.6987572 · Ref ID: 6066 Powerful organizations are those that manage their power factors efficiently; organizational resources are considered vital power factors, and Knowledge is one of the most important resources to manage. There is no universally accepted Knowledge Management (KM) process, but it is known that establishing the appropriate knowledge flows in the organization is the main goal of organizational KM. A Knowledge Management System (KMS) is an information system which supports the KM process, mainly by providing the required knowledge and enhancing its flow. Organizations increasingly feel the need for appropriate methodologies for developing their target KMSs. However, existing KMS development methodologies are not comprehensive enough to satisfy all organizational needs. In this paper, we propose an abstract KMS development methodology which alleviates the weaknesses of existing methodologies while reusing their strengths. Method engineers can develop concrete methodologies by instantiating the proposed abstract methodology and adding the necessary detail, thus producing bespoke methodologies which are best suited to organizational needs. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3767
-
Deng 2023
PK-Chat: Pointer Network Guided Knowledge Driven Generative Dialogue Model
arXiv 2023;(): 2023 Ref ID: 7668 In the research of end-to-end dialogue systems, using real-world knowledge to generate natural, fluent, and human-like utterances with correct answers is crucial. However, domain-specific conversational dialogue systems may be incoherent and introduce erroneous external information to answer questions due to the out-of-vocabulary issue or the wrong knowledge from the parameters of the neural network. In this work, we propose PK-Chat, a Pointer network guided Knowledge-driven generative dialogue model, incorporating a unified pretrained language model and a pointer network over knowledge graphs. The words generated by PK-Chat in the dialogue are derived from the prediction of word lists and the direct prediction of the external knowledge graph knowledge. Moreover, based on the PK-Chat, a dialogue system is built for academic scenarios in the case of geosciences. Finally, an academic dialogue benchmark is constructed to evaluate the quality of dialogue systems in academic scenarios and the source code is available online. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2149
-
Deng 2023
An Artificial Intelligence Model Recommendation Method for Power Dispatching Scenario Based on Knowledge Graph and Scene Label Matching
2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2023;11():1151-1155 2023 DOI: 10.1109/ITAIC58329.2023.10409033 · Ref ID: 6024 In the field of power dispatching, more and more tasks have adopted artificial intelligence solutions, and related research and literature are also showing an exponential explosive growth. In order to solve the problem of artificial intelligence information overload in power dispatching scenario, and to facilitate researchers with experience in power dispatching but lacking experience in artificial intelligence to use related algorithm models more effectively and conveniently in their work, this paper constructs a domain knowledge graph of artificial intelligence models for power dispatching scenario, and proposes an artificial intelligence model recommendation method based on knowledge graph and scene label matching. According to the cosine similarity of scene labels, the artificial intelligence model mapped on the knowledge graph is recommended. This method can be applied to the actual power dispatching scenario to provide personalized artificial intelligence model recommendation for different scientific research tasks, which greatly improves the retrieval efficiency of relevant researchers. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3469
-
Dernbach 2024
GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding
arXiv 2024;(): 2024 Ref ID: 8088 Integrating large language models (LLMs) with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While large language models excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query a LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no–such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LAnguage Models (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the large-language model's generative capabilities to create the dataset and proposes an efficient alternate to retrieval-augmented generation styled methods. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1903
-
Deshpande 2022
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes
WOAH 2022 - 6th Workshop on Online Abuse and Harms, Proceedings of the Workshop 2022;():67-78 Association for Computational Linguistics (ACL) 2022 Ref ID: 5457 Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models. However, many techniques rely on human-compiled lists of bias terms, which are expensive to create and are limited in coverage. In this study, we present a fully data-driven pipeline for generating a knowledge graph (KG) of cultural knowledge and stereotypes. Our resulting KG covers 5 religious groups and 5 nationalities and can easily be extended to include more entities. Our human evaluation shows that the majority (59.2%) of non-singleton entries are coherent and complete stereotypes. We further show that performing intermediate masked language model training on the verbalized KG leads to a higher level of cultural awareness in the model and has the potential to increase classification performance on knowledge-crucial samples on a related task, i.e., hate speech detection. © 2022 Association for Computational Linguistics. |
Mike
voted
Davis
voted
Final decision
What was the agreed final decision?
#3206
-
Díaz 2024
Automatic knowledge-graph creation from historical documents: The Chilean dictatorship as a case study
arXiv 2024;(): 2024 Ref ID: 8552 We present our results regarding the automatic construction of a knowledge graph from historical documents related to the Chilean dictatorship period (1973-1990). Our approach consists on using LLMs to automatically recognize entities and relations between these entities, and also to perform resolution between these sets of values. In order to prevent hallucination, the interaction with the LLM is grounded in a simple ontology with 4 types of entities and 7 types of relations. To evaluate our architecture, we use a gold standard graph constructed using a small subset of the documents, and compare this to the graph obtained from our approach when processing the same set of documents. Results show that the automatic construction manages to recognize a good portion of all the entities in the gold standard, and that those not recognized are mostly explained by the level of granularity in which the information is structured in the graph, and not because the automatic approach misses an important entity in the graph. Looking forward, we expect this report will encourage work on other similar projects focused on enhancing research in humanities and social science, but we remark that better evaluation metrics are needed in order to accurately fine-tune these types of architectures. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2346
-
Dietrich 2014
Distributed management and representation of data and context in robotic applications
2014 IEEE/RSJ International Conference on Intelligent Robots and Systems 2014;():1133-1140 2014 DOI: 10.1109/IROS.2014.6942700 · Ref ID: 6755 The traditional, isolated data handling in sensor-actuator systems does not fulfill the requirements of robots that need to interact with their smart environment. Consequently, we have to develop new mechanisms for adaptive data and context handling. We firstly investigate what types of data are present within smart environments and how they can be classified and organized. Only if the available data can be structured, it can be queried and thus put into context. This is important because the variety of data and possible interpretations is tremendous, ranging from measurement values, sensor and robot descriptions/states/commands, to environmental data, such as positions, maps, spatial relations, etc. To cope with this diversity, we developed a solution capable of storing and accessing data within a distributed environment by providing additional context information. Furthermore, we describe how this information can be assembled in a task-oriented manner. This enables robots to dynamically generate environmental abstractions by using data from different sources and also enables them to incorporate external sensor measurements. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1978
-
Dietze 2023
Towards syntax-aware pretraining and prompt engineering for knowledge retrieval from large language models
CEUR Workshop Proceedings 2023;3577(): CEUR-WS 2023 Ref ID: 5080 The ability to access relational knowledge from LLM parameters, known as relational knowledge retrieval (rKR), is considered a critical factor in their capacity to comprehend and interpret natural language. However, the role of syntax in this context has not been adequately explored. In this position paper, we hypothesize a close link between the accessibility of relational knowledge and syntax. We discuss related works and lay out a research agenda focused on rKR from self-supervised LLMs without or with minimal fine-tuning and aiming at understanding the impact of syntax on rKR. This involves examining biases, factors affecting result reliability and robustness, and analyzing the effect of syntactic features in training corpora on rKR. We argue that rKR can be improved through syntax-aware pretraining and prompt engineering, and propose a dedicated research agenda geared toward exploring the impact of syntax on knowledge retrieval. © 2023 CEUR-WS. All rights reserved. |
Ishan
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2635
-
Dillon 2001
Lightweight analysis of operational specifications using inference graphs
Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001 2001;():57-67 2001 DOI: 10.1109/ICSE.2001.919081 · Ref ID: 6796 The Amalia framework generates lightweight components that automate the analysis of operational specifications and designs. A key concept is the step analyzer, which enables Amalia to automatically tailor high-level analyses, such as behavior simulation and model checking, to different specification languages and representations. A step analyzer uses a new abstraction, called an inference graph, for the analysis. It creates and evaluates an inference graph on-the-fly during a top-down traversal of a specification to deduce the specification's local behaviors (called steps). The nodes of an inference graph directly reify the rules in an operational semantics, enabling Amalia to automatically generate a step analyzer from an operational description of a notation's semantics. Inference graphs are a clean abstraction that can be formally defined. The paper provides a detailed but informal introduction to inference graphs. It uses example specifications written in LOTOS for purposes of illustration. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#519
-
Ding 2024
Leveraging Chain-of-Thought to Enhance Stance Detection with Prompt-Tuning
Investigating public attitudes towards social media is crucial for opinion mining systems to gain valuable insights. Stance detection, which aims to discern the attitude expressed in an opinionated text towards a specific target, is a fundamental task in opinion mining. Conventional approaches mainly focus on sentence-level classification techniques. Recent research has shown that the integration of background knowledge can significantly improve stance detection performance. Despite the significant improvement achieved by knowledge-enhanced methods, applying these techniques in real-world scenarios remains challenging for several reasons. Firstly, existing methods often require the use of complex attention mechanisms to filter out noise and extract relevant background knowledge, which involves significant annotation efforts. Secondly, knowledge fusion mechanisms typically rely on fine-tuning, which can introduce a gap between the pre-training phase of pre-trained language models (PLMs) and the downstream stance detection tasks, leading to the poor prediction accuracy of the PLMs. To address these limitations, we propose a novel prompt-based stance detection method that leverages the knowledge acquired using the chain-of-thought method, which we refer to as PSDCOT. The proposed approach consists of two stages. The first stage is knowledge extraction, where instruction questions are constructed to elicit background knowledge from a VLPLM. The second stage is the multi-prompt learning network (M-PLN) for knowledge fusion, which learns model performance based on the background knowledge and the prompt learning framework. We evaluated the performance of PSDCOT on publicly available benchmark datasets to assess its effectiveness in improving stance detection performance. The results demonstrate that the proposed method achieves state-of-the-art results in in-domain, cross-target, and zero-shot learning settings. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3173
-
Ding 2024
3DS: Decomposed Difficulty Data Selection's Case Study on LLM Medical Domain Adaptation
arXiv 2024;(): 2024 Ref ID: 8704 Large Language Models (LLMs) excel in general tasks but struggle in specialized domains like healthcare due to limited domain-specific knowledge. Supervised Fine-Tuning (SFT) data construction for domain adaptation often relies on heuristic methods, such as GPT-4 annotation or manual data selection, with a data-centric focus on presumed diverse, high-quality datasets. However, these methods overlook the model's inherent knowledge distribution, introducing noise, redundancy, and irrelevant data, leading to a mismatch between the selected data and the model's learning task, resulting in suboptimal performance. To address this, we propose a two-stage model-centric data selection framework, Decomposed Difficulty Data Selection (3DS), which aligns data with the model's knowledge distribution for optimized adaptation. In Stage 1, we apply Prompt-Driven Data Selection via Explicit Alignment, where the model filters irrelevant or redundant data based on its internal knowledge. In Stage 2, we perform Decomposed Difficulty Data Selection, where data selection is guided by our defined difficulty decomposition, using three metrics: Instruction Understanding, Response Confidence, and Response Correctness. Additionally, an attention-based importance weighting mechanism captures token importance for more accurate difficulty calibration. This two-stage approach ensures the selected data is not only aligned with the model's knowledge and preferences but also appropriately challenging for the model to learn, leading to more effective and targeted domain adaptation. In the case study of the medical domain, our extensive experiments on real-world healthcare datasets demonstrate the superiority of 3DS over existing methods in accuracy by over 5.29%. Our dataset and code will be open-sourced at https://anonymous.4open.science/r/3DS-E67F. |
Xinchen
voted
Davis
voted
Final decision
What was the agreed final decision?
#3203
-
Ding 2024
Automated Construction of Theme-specific Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8262 Despite widespread applications of knowledge graphs (KGs) in various tasks such as question answering and intelligent conversational systems, existing KGs face two major challenges: information granularity and deficiency in timeliness. These hinder considerably the retrieval and analysis of in-context, fine-grained, and up-to-date knowledge from KGs, particularly in highly specialized themes (e.g., specialized scientific research) and rapidly evolving contexts (e.g., breaking news or disaster tracking). To tackle such challenges, we propose a theme-specific knowledge graph (i.e., ThemeKG), a KG constructed from a theme-specific corpus, and design an unsupervised framework for ThemeKG construction (named TKGCon). The framework takes raw theme-specific corpus and generates a high-quality KG that includes salient entities and relations under the theme. Specifically, we start with an entity ontology of the theme from Wikipedia, based on which we then generate candidate relations by Large Language Models (LLMs) to construct a relation ontology. To parse the documents from the theme corpus, we first map the extracted entity pairs to the ontology and retrieve the candidate relations. Finally, we incorporate the context and ontology to consolidate the relations for entity pairs. We observe that directly prompting GPT-4 for theme-specific KG leads to inaccurate entities (such as "two main types" as one entity in the query result) and unclear (such as "is", "has") or wrong relations (such as "have due to", "to start"). In contrast, by constructing the theme-specific KG step by step, our model outperforms GPT-4 and could consistently identify accurate entities and relations. Experimental results also show that our framework excels in evaluations compared with various KG construction baselines. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1351
-
Ding 2023
Generative Semantic Modeling for Structured Data Source with Large Language Model
Proceedings - 2023 IEEE International Conference on High Performance Computing and Communications, Data Science and Systems, Smart City and Dependability in Sensor, Cloud and Big Data Systems and Application, HPCC/DSS/SmartCity/DependSys 2023 2023;():1148-1152 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/HPCC-DSS-SmartCity-DependSys60770.2023.00164 · Ref ID: 4947 The paper introduces a generative semantic model for representing human knowledge in a way that enables computer understanding and reasoning. The current approach to semantic modeling involves mapping between the space of plausible semantic models and the provided data source. However, this approach has limitations, as the score functions used to search for the best candidate semantic model are either trained on a specific integration knowledge graph or rely on manually designed features. To address these limitations, the authors propose a new approach that combines an encoder made with a pre-trained large language model (LLM) with a graph decoder customized to generate semantics. The encoder-decoder system is designed to be trained on knowledge graphs, and the authors introduce an algorithm to generate training samples from the big knowledge graph by decomposing training samples into construction actions using a method similar to the transition system of the Syntax Parser. The proposed method is novel, as it is the first time a generative method has been applied to the semantic modeling task, empowered with an LLM, and trained on knowledge graphs to achieve better performance on standard benchmarks than in past work. In conclusion, the proposed generative semantic model offers a promising new approach to representing and organizing human knowledge in a more generalizable way, using a combination of a pre-trained LLM and a customized graph decoder trained on knowledge graphs. The approach has shown improved performance on standard benchmarks and has the potential to advance the field of semantic modeling. © 2023 IEEE. |
Kwesi
voted
Xinchen
voted
|
|||
|
|
||||
|
#2007
-
Ding 2023
A Unified Knowledge Graph Augmentation Service for Boosting Domain-specific NLP Tasks
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():353-369 Association for Computational Linguistics (ACL) 2023 DOI: 10.18653/v1/2023.findings-acl.24 · Ref ID: 5173 By focusing the pre-training process on domain-specific corpora, some domain-specific pre-trained language models (PLMs) have achieved state-of-the-art results. However, it is under-investigated to design a unified paradigm to inject domain knowledge in the PLM fine-tuning stage. We propose KnowledgeDA, a unified domain language model development service to enhance the task-specific training procedure with domain knowledge graphs. Given domain-specific task texts input, KnowledgeDA can automatically generate a domain-specific language model following three steps: (i) localize domain knowledge entities in texts via an embedding-similarity approach; (ii) generate augmented samples by retrieving replaceable domain entity pairs from two views of both knowledge graph and training data; (iii) select high-quality augmented samples for fine-tuning via confidence-based assessment. We implement a prototype of KnowledgeDA to learn language models for two domains, healthcare and software development. Experiments on domain-specific text classification and QA tasks verify the effectiveness and generalizability of KnowledgeDA. © 2023 Association for Computational Linguistics. |
Srividya
voted
Xinchen
voted
|
|||
|
|
||||
|
#2050
-
Ding 2024
zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():1877-1895 Association for Computational Linguistics (ACL) 2024 Ref ID: 4562 Modeling evolving knowledge over temporal knowledge graphs (TKGs) has become a heated topic. Various methods have been proposed to forecast links on TKGs. Most of them are embedding-based, where hidden representations are learned to represent knowledge graph (KG) entities and relations based on the observed graph contexts. Although these methods show strong performance on traditional TKG forecasting (TKGF) benchmarks, they face a strong challenge in modeling the unseen zero-shot relations that have no prior graph context. In this paper, we try to mitigate this problem as follows. We first input the text descriptions of KG relations into large language models (LLMs) for generating relation representations, and then introduce them into embedding-based TKGF methods. LLM-empowered representations can capture the semantic information in the relation descriptions. This makes the relations, whether seen or unseen, with similar semantic meanings stay close in the embedding space, enabling TKGF models to recognize zero-shot relations even without any observed graph context. Experimental results show that our approach helps TKGF models to achieve much better performance in forecasting the facts with previously unseen relations, while still maintaining their ability in link forecasting regarding seen relations. © 2024 Association for Computational Linguistics. |
Mike
voted
Srividya
voted
|
|||
|
|
||||
|
#1403
-
Ding 2024
Improving Recall of Large Language Models: A Model Collaboration Approach for Relational Triple Extraction
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():8890-8901 European Language Resources Association (ELRA) 2024 Ref ID: 4559 Relation triple extraction, which outputs a set of triples from long sentences, plays a vital role in knowledge acquisition. Large language models can accurately extract triples from simple sentences through few-shot learning or fine-tuning when given appropriate instructions. However, they often miss out when extracting from complex sentences. In this paper, we design an evaluation-filtering framework that integrates large language models with small models for relational triple extraction tasks. The framework includes an evaluation model that can extract related entity pairs with high precision. We propose a simple labeling principle and a deep neural network to build the model, embedding the outputs as prompts into the extraction process of the large model. We conduct extensive experiments to demonstrate that the proposed method can assist large language models in obtaining more accurate extraction results, especially from complex sentences containing multiple relational triples. Our evaluation model can also be embedded into traditional extraction models to enhance their extraction precision from complex sentences. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Mike
voted
Xinchen
voted
|
|||
|
|
||||
|
#2622
-
Djemai 2024
Knowledge-based Reactive Planning and Re-planning – A Case-Study Approach
2024 IEEE Conference on Artificial Intelligence (CAI) 2024;():770-775 2024 DOI: 10.1109/CAI59869.2024.00147 · Ref ID: 6155 When a disaster strikes, man-made or natural, evacuation plans are put under immediate constraints, including topological, temporal, and spontaneously occurring events such as fire, smoke and obstacles introducing bottlenecks and impeding ingress and egress. Planning for uncertainties arising from indoor evacuations can be complex as there’s a fine balance to strike between a too-detailed plan and one that’s too vague. Such constraints apply to office and residential buildings, airports, mining sites, stadiums, ships, etc. Although some indoor spatial models have been developed, many are complex, and their applicability is non-universal. This paper proposes an innovative approach that harnesses the power of the Semantic Web Rule Language (SWRL) based on Web Ontology Language (OWL) to enhance existing evacuation planning methods through data-rich modelling. The OWL ontology serves as a formal representation of real-world concepts, their relationships, and properties. To demonstrate its application, the ontology is implemented in a case study involving London Metropolitan University’s Tower Building, and its design is elucidated in this paper. |
mohammed afaan
voted
Ishan
voted
|
|||
|
|
||||
|
#1240
-
Dobriy 2024
Employing RAG to Create a Conference Knowledge Graph from Text
CEUR Workshop Proceedings 2024;3747():18 CEUR-WS 2024 Ref ID: 4345 In this paper, we present Semantic Observer, a platform that 1) defines a FAIR Conference Ontology for describing academic conferences, 2) presents an RAG architecture that constructs a Conference Knowledge Graph based on this ontology, 3) evaluates the architecture on a corpus of latest available CORE conference websites. The Conference Ontology models key entities such as conferences, workshops and challenges, organizer and programme committees, calls for papers and proposals as well as major deadlines and relevant topics. In the evaluation, we compare the performance of three leading Large Language Models: GPT-4 Turbo and Claude 3 Opus - in supporting the Knowledge Graph construction from text. The best-performing RAG architecture is then implemented in Semantic Observer and available in a SPARQL endpoint to make up-to-date conference information FAIR: findable, accessible, interoperable and reusable. © 2024 Copyright for this paper by its authors. |
brandon
voted
Kwesi
voted
|
|||
|
|
||||
|
#1651
-
Dong 2024
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():2417-2429 Association for Computational Linguistics (ACL) 2024 Ref ID: 4316 Knowledge-based visual question answering (KVQA) has been extensively studied to answer visual questions with external knowledge, e.g., knowledge graphs (KGs). While several attempts have been proposed to leverage large language models (LLMs) as an implicit knowledge source, it remains challenging since LLMs may generate hallucinations. Moreover, multiple knowledge sources, e.g., images, KGs and LLMs, cannot be readily aligned for complex scenarios. To tackle these, we present a novel modality-aware integration with LLMs for KVQA (MAIL). It carefully leverages multimodal knowledge for both image understanding and knowledge reasoning. Specifically, (i) we propose a two-stage prompting strategy with LLMs to densely embody the image into a scene graph with detailed visual features; (ii) We construct a coupled concept graph by linking the mentioned entities with external facts. (iii) A tailored pseudo-siamese graph medium fusion is designed for sufficient multimodal fusion. We utilize the shared mentioned entities in two graphs as mediums to bridge a tight inter-modal exchange, while maximally preserving insightful intra-modal learning by constraining the fusion within mediums. Extensive experiments show the superiority of MAIL. © 2024 Association for Computational Linguistics. |
Kwesi
voted
yuexi
voted
|
|||
|
|
||||
|
#3157
-
Dou 2024
ShennongMGS: An LLM-based Chinese Medication Guidance System
|
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#757
-
Du 2024
Semantic-enhanced reasoning question answering over temporal knowledge graphs
Question Answering Over Temporal Knowledge Graphs (TKGQA) is an important topic in question answering. TKGQA focuses on accurately understanding questions involving temporal constraints and retrieving accurate answers from knowledge graphs. In previous research, the hierarchical structure of question contexts and the constraints imposed by temporal information on different sentence components have been overlooked. In this paper, we propose a framework called "Semantic-Enhanced Reasoning Question Answering" (SERQA) to tackle this problem. First, we adopt a pretrained language model (LM) to obtain the question relation representation vector. Then, we leverage syntactic information from the constituent tree and dependency tree, in combination with Masked Self-Attention (MSA), to enhance temporal constraint features. Finally, we integrate the temporal constraint features into the question relation representation using an information fusion function for answer prediction. Experimental results demonstrate that SERQA achieves better performance on the CRONQUESTIONS and ImConstrainedQuestions datasets. In comparison with existing temporal KGQA methods, our model exhibits outstanding performance in comprehending temporal constraint questions. The ablation experiments verified the effectiveness of combining the constituent tree and the dependency tree with MSA in question answering. |
Srividya
voted
Xinchen
voted
|
|||
|
|
||||
|
#3543
-
Du 2024
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
arXiv 2024;(): 2024 Ref ID: 8555 Recent works have attempted to integrate external knowledge into LLMs to address the limitations and potential factual errors in LLM-generated content. However, how to retrieve the correct knowledge from the large amount of external knowledge imposes a challenge. To this end, we empirically observe that LLMs have already encoded rich knowledge in their pretrained parameters and utilizing this internal knowledge improves the retrieval of external knowledge when applying them to knowledge-intensive tasks. In this paper, we propose a new internal and external knowledge interactive refinement paradigm dubbed IEKR to utilize internal knowledge in the LLM to help retrieve relevant knowledge from the external knowledge base, as well as exploit the external knowledge to refine the hallucination of generated internal knowledge. By simply adding a prompt like 'Tell me something about' to the LLMs, we try to review related explicit knowledge and insert it with the query into the retriever for external retrieval. The external knowledge is utilized to complement the internal knowledge in the input of the LLM for answers. We conduct experiments on 3 benchmark datasets for the knowledge-intensive question answering task with different LLMs and domains, achieving a new state of the art. Further analysis shows the effectiveness of different modules in our approach. |
Kwesi
voted
mohammed afaan
voted
|
|||
|
|
||||
|
#415
-
Du 2023
KLDP:A Data Profiling Technique Based on Knowledge Graph and Large Language Modeling
IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) / BigDataSE Conference / CSE Conference / EUC Conference / ISCI Conference 2023;():2333-2340 Exeter, ENGLAND Ieee Computer Soc 2023 DOI: 10.1109/TrustCom60117.2023.00329 · Ref ID: 3229 The explosive growth of medical data has perfected the establishment of patients' personal health records and provided favorable conditions for smart healthcare, but its fragmentation also brings challenges to patient management. Mainstream research focuses on utilizing medical data to construct disease knowledge graphs to assist patient management, but does not effectively manage massive patient data. In order to make full use of patient data and facilitate the circulation of patient data elements, we propose a new patient sketching technique, KLDP. It constructs knowledge graphs through pre-training techniques, effectively manages patient data based on patients' personal health records and medical history information throughout the treatment cycle, and elementalizes patient data, providing new ideas and implementation solutions for patient management. |
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#291
-
Du 2024
From Static to Dynamic: Knowledge Metabolism for Large Language Models
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():23784-23786 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3567 The immense parameter space of Large Language Models (LLMs) endows them with superior knowledge retention capabilities, allowing them to excel in a variety of natural language processing tasks. However, it also instigates difficulties in consistently tuning LLMs to incorporate the most recent knowledge, which may further lead LLMs to produce inaccurate and fabricated content. To alleviate this issue, we propose DynaMind, a knowledge metabolism framework for LLMs, which proactively sustains the credibility of knowledge through an auxiliary memory component and directly delivers pertinent knowledge for LLM inference, thereby suppressing hallucinations caused by obsolete internal knowledge during the LLM inference process. Benchmark experiments demonstrate DynaMind's effectiveness in overcoming this challenge. The code and demo of DynaMind are available at: https://github.com/Elfsong/DynaMind. |
yuexi
voted
Mike
voted
|
|||
|
|
||||
|
#2048
-
Du 2024
ZhuJiu-Knowledge: A Fairer Platform for Evaluating Multiple Knowledge Types in Large Language Models
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;3():194-206 Association for Computational Linguistics (ACL) 2024 Ref ID: 4491 The swift advancement in large language models (LLMs) has heightened the importance of model evaluations. LLMs have acquired a substantial amount of knowledge, and evaluating the knowledge of these LLMs is crucial. To address this, we introduce the ZhuJiu-Knowledge benchmark which carefully considers the following factors: (1) For knowledge scope, we concentrate on three domains: commonsense knowledge, world knowledge, language knowledge, which comes from ATOMIC, Conceptnet, Wikidata, and Wordnet. (2) For data construction, to prevent data contamination, we utilize knowledge derived from corpora and knowledge graphs to formulate novel questions that are ensured not to appear in the training corpus. A multitude of prompts is purposefully devised to mitigate the impact of prompt design on evaluation and to further analyze the LLMs’ sensitivity to various prompts. (3) For evaluation criteria, we propose a novel voting methodology for assessing generative text, aligning the model’s evaluation with human preferences to reduce biases inherent in individual model assessments. We evaluate 14 current mainstream LLMs and conduct a comprehensive discussion and analysis of their results. The ZhuJiu-Knowledge benchmark and open-participation leaderboard are publicly released at http://zhujiu-knowledge.top/ and we also provide a demo video at https://youtu.be/QJp4qlEHVH8. © 2024 Association for Computational Linguistics. |
Xinchen
voted
Srividya
voted
|
|||
|
|
||||
|
#698
-
Du 2015
Ranking Web Page with Path Trust Knowledge Graph
5th International Conference on Intelligence Science and Big Data Engineering (IScIDE) 2015;9243():66-75 Suzhou, PEOPLES R CHINA Springer International Publishing Ag 2015 DOI: 10.1007/978-3-319-23862-3_7 · Ref ID: 3180 How to find and discover useful information from the Internet is a real challenge in information retrieval (IR) and search engines (SE). In this paper, we propose and construct the Path Trust Knowledge Graph (PTKG) model for assigning priority values to the unvisited web pages. For a given user-specific topic t, its PTKG contains five parts: (1) The context graph G(t) = (V, E), where V is the crawled history web page set and E includes the hyperlink set among the history web pages; (2) Retrieving knowledge implied in the paths among these web pages and finding their lengths; (3) Building the trust degrees among the web pages; (4) Constructing a topic-specific language model and a general language model by using the trust degrees; (5) Assigning the priority values of web pages for ranking them. Finally, we perform an experimental comparison of our proposed PTKG approach with the classic LCG and RCG. As a result, our method outperforms LCG and RCG. |
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#511
-
Duan 2021
Learning Numeracy: A Simple Yet Effective Number Embedding Approach Using Knowledge Graph
Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2021;():2597-2602 Punta Cana, DOMINICAN REP Assoc Computational Linguistics-Acl 2021 Ref ID: 2971 Numeracy plays a key role in natural language understanding. However, existing NLP approaches, whether the traditional word2vec approach or contextualized transformer-based language models, fail to learn numeracy. As a result, the performance of these models is limited when they are applied to number-intensive applications in clinical and financial domains. In this work, we propose a simple number embedding approach based on a knowledge graph. We construct a knowledge graph consisting of number entities and magnitude relations. A knowledge graph embedding method is then applied to obtain number vectors. Our approach is easy to implement, and experimental results on various numeracy-related NLP tasks demonstrate the effectiveness and efficiency of our method. |
Srividya
voted
Xinchen
voted
|
|||
|
|
||||
|
#773
-
Duan 2023
Simple Knowledge Graph Completion Model Based on Differential Negative Sampling and Prompt Learning
Knowledge graphs (KGs) serve as a crucial resource for numerous artificial intelligence tasks, significantly contributing to the advancement of the AI field. However, the incompleteness of existing KGs hinders their effectiveness in practical applications. Consequently, researchers have proposed the task of KG completion. Currently, embedding-based techniques dominate the field as they leverage the structural information within KGs to infer and complete missing parts. Nonetheless, these methods exhibit limitations. They are limited by the quality and quantity of structural information and are unable to handle the missing entities in the original KG. To overcome these challenges, researchers have attempted to integrate pretrained language models and textual data to perform KG completion. This approach utilizes the definition statements and description text of entities within KGs. The goal is to compensate for the latent connections that are difficult for traditional methods to obtain. However, text-based methods still lag behind embedding-based models in terms of performance. Our analysis reveals that the critical issue lies in the selection process of negative samples. In order to enhance the performance of the text-based methods, various types of negative sampling methods are employed in this study. We introduce prompt learning to fill the gap between the pre-trained language model and the knowledge graph completion task, and to improve the model's reasoning level. Simultaneously, a ranking strategy based on KG structural information is proposed to utilize KG structured data to assist reasoning. The experimental results demonstrate that our model exhibits strong competitiveness and outstanding inference speed. By fully exploiting the internal structural information of KGs and external relevant descriptive text resources, we successfully elevate the performance levels of KG completion tasks across various metrics. |
Srividya
voted
Ishan
voted
|
|||
|
|
||||
|
#3883
-
Dunn 2022
Structured information extraction from complex scientific text with fine-tuned large language models
arXiv 2022;(): 2022 Ref ID: 7624 Intelligently extracting and linking complex scientific information from unstructured text is a challenging endeavor particularly for those inexperienced with natural language processing. Here, we present a simple sequence-to-sequence approach to joint named entity recognition and relation extraction for complex hierarchical information in scientific text. The approach leverages a pre-trained large language model (LLM), GPT-3, that is fine-tuned on approximately 500 pairs of prompts (inputs) and completions (outputs). Information is extracted either from single sentences or across sentences in abstracts/passages, and the output can be returned as simple English sentences or a more structured format, such as a list of JSON objects. We demonstrate that LLMs trained in this way are capable of accurately extracting useful records of complex scientific knowledge for three representative tasks in materials chemistry: linking dopants with their host materials, cataloging metal-organic frameworks, and general chemistry/phase/morphology/application information extraction. This approach represents a simple, accessible, and highly-flexible route to obtaining large databases of structured knowledge extracted from unstructured text. An online demo is available at http://www.matscholar.com/info-extraction. |
Davis
voted
Mike
voted
|
|||
|
|
||||
|
#1989
-
Dunn 2024
Transforming Generative Large Language Models' Limitations into Strengths using Gestalt: A Synergetic Approach to Mathematical Problem-Solving with Computational Engines
Proceedings of the Annual Hawaii International Conference on System Sciences 2024;():5185-5194 IEEE Computer Society 2024 Ref ID: 4446 This paper presents an innovative approach, known as Gestalt, to enhance the mathematical problem-solving capabilities of Generative Large Language Models (GLLMs) while addressing their inherent limitations. Recognizing the inherent structure and discerning strength of GLLMs, the core of our approach strategically offloads computations, deterministic questions, and knowledge retrieval to external tools such as Wolfram Alpha and Python REPL. This critical augmentation not only mitigates GLLMs' variable reliability in these areas but also fortifies their innate strength: understanding the underlying structure of the problems at hand. With this novel implementation, GLLMs can harness the potential of external systems through well-structured queries, enabling them to make significant strides in problem-solving. In a preliminary evaluation, the Gestalt system demonstrates exceptional performance on a portion of the MATH benchmark dataset, achieving a state-of-the-art accuracy of 59.00%. In comparison, GPT-4 achieves an accuracy of 53.9% on the identical dataset. Through our augmentation approach, we aim to transform the limitations of GLLMs into their strengths, opening up exciting new possibilities not only in advanced mathematical problem-solving but also in various deterministic tasks such as medical diagnosis. © 2024 IEEE Computer Society. All rights reserved. |
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#1415
-
Dura 2002
Information retrieval based on explicit knowledge representation
CEUR Workshop Proceedings 2002;1168(): CEUR-WS 2002 Ref ID: 5840 The tool which we tested in the present monolingual retrieval task, Lexware®, is based on explicit knowledge representation, not on statistical language modeling. In the present task Lexware® indexing seems to be satisfactory while its query builder is not. The system has been tested extensively on indexing of Swedish parliamentary debates with very good results. We are happy that Swedish is finally introduced into CLEF; unfortunately, the present test suite is not reliable. Swedish parliamentary debates may perhaps be used instead. They are many, constantly growing and they are thoroughly indexed manually with keywords chosen from a thesaurus of about 4000 items. Copyright © 2002 for the individual papers by the papers' authors. |
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#629
-
Durmaz 2024
An ontology-based text mining dataset for extraction of process-structure-property entities
While large language models learn sound statistical representations of the language and information therein, ontologies are symbolic knowledge representations that can complement the former ideally. Research at this critical intersection relies on datasets that intertwine ontologies and text corpora to enable training and comprehensive benchmarking of neurosymbolic models. We present the MaterioMiner dataset and the linked materials mechanics ontology where ontological concepts from the mechanics of materials domain are associated with textual entities within the literature corpus. Another distinctive feature of the dataset is its eminently fine-grained annotation. Specifically, 179 distinct classes are manually annotated by three raters within four publications, amounting to 2191 entities that were annotated and curated. Conceptual work is presented for the symbolic representation of causal composition-process-microstructure-property relationships. We explore the annotation consistency between the three raters and perform fine-tuning of pre-trained language models to showcase the feasibility of training named entity recognition models. Reusing the dataset can foster training and benchmarking of materials language models, automated ontology construction, and knowledge graph generation from textual data. |
Davis
voted
Srividya
voted
|
|||
|
|
||||
|
#3973
-
Egami 2024
VHAKG: A Multi-modal Knowledge Graph Based on Synchronized Multi-view Videos of Daily Activities
arXiv 2024;(): 2024 Ref ID: 8563 Multi-modal knowledge graphs (MMKGs), which ground various non-symbolic data (e.g., images and videos) into symbols, have attracted attention as resources enabling knowledge processing and machine learning across modalities. However, the construction of MMKGs for videos consisting of multiple events, such as daily activities, is still in the early stages. In this paper, we construct an MMKG based on synchronized multi-view simulated videos of daily activities. Besides representing the content of daily life videos as event-centric knowledge, our MMKG also includes frame-by-frame fine-grained changes, such as bounding boxes within video frames. In addition, we provide support tools for querying our MMKG. As an application example, we demonstrate that our MMKG facilitates benchmarking vision-language models by providing the necessary vision-language datasets for a tailored task. |
mohammed afaan
voted
yuexi
voted
|
|||
|
|
||||
|
#1638
-
Eggert 2023
Memory Net: Generalizable Common-Sense Reasoning over Real-World Actions and Objects
International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K - Proceedings 2023;2():182-189 Science and Technology Publications, Lda 2023 DOI: 10.5220/0012182300003598 · Ref ID: 5078 In this paper, we explore how artificial agents (AAs) can understand and reason about so-called "action patterns" within real-world settings. Essentially, we want AAs to determine which tools fit specific actions, and which actions can be executed with certain tools, objects or agents, based on real-world situations. To achieve this, we utilize a comprehensive Knowledge Graph, called "Memory Net", filled with interconnected everyday concepts, common actions, and environmental data. Our approach involves an inference technique that harnesses semantic proximity through subgraph matching. Comparing our approach against human responses and a state-of-the-art natural language model based machine learning approach in a home scenario, our Knowledge Graph method demonstrated strong generalization capabilities, suggesting its promise in dynamic, incremental and interactive real-world settings. Copyright © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0). |
Davis
voted
mohammed afaan
voted
|
|||
|
|
||||
|
#2500
-
Eisermann 2021
Generalization in Multimodal Language Learning from Simulation
2021 International Joint Conference on Neural Networks (IJCNN) 2021;():1-8 2021 DOI: 10.1109/IJCNN52387.2021.9534275 · Ref ID: 6357 Neural networks can be powerful function approximators, which are able to model high-dimensional feature distributions from a subset of examples drawn from the target distribution. Naturally, they perform well at generalizing within the limits of their target function, but they often fail to generalize outside of the explicitly learned feature space. It is therefore an open research topic whether and how neural network-based architectures can be deployed for systematic reasoning. Many studies have shown evidence for poor generalization, but they often work with abstract data or are limited to single-channel input. Humans, however, learn and interact through a combination of multiple sensory modalities, and rarely rely on just one. To investigate compositional generalization in a multimodal setting, we generate an extensible dataset with multimodal input sequences from simulation. We investigate the influence of the underlying training data distribution on compositional generalization in a minimal LSTM-based network trained in a supervised, time-continuous setting. We find compositional generalization to fail in simple setups while improving with the number of objects, actions, and particularly with a lot of color overlaps between objects. Furthermore, multimodality strongly improves compositional generalization in settings where a pure vision model struggles to generalize. |
brandon
voted
Kwesi
voted
Final decision
#1442
-
Elena 2024
THE ISSUES OF CREATION OF MACHINE-UNDERSTANDABLE SMART STANDARDS BASED ON KNOWLEDGE GRAPHS
The development of digital transformation requires the widespread use of digital technologies in standardization documents. One of the goals is to create standards with machine-understandable content that will allow the use of digital documents at various stages of development and production without the need for a human operator. The purpose of this work is to describe an approach for creating and translating industry normative documents into a machine-understandable representation for their further use in software services and systems. There are three types of SMART standard content: machine-readable, machine-interpretable, and machine-understandable. Knowledge graphs are actively used to formalize data and knowledge when solving various problems. The new two-level approach is proposed for the creation and translation into a machine-understandable representation of regulatory documents as knowledge graphs. The approach defines two types of interpretation of a smart document (human readability and machine understandability) through two related formats: a graph, each semantic node of which represents text in a natural language, and a network of concepts and strict connections. Each node of a human-readable graph corresponds (in general) to a subtree of a machine-readable knowledge graph. As the basis for ensuring the transformation of one form of smart standard representation into another form, LLM models are used, supplemented by a specialized adapter obtained as a result of additional training using the Parameter-Efficient Fine-Tuning approach. Requirements have been established for a set of problem- and subject-oriented tools for generating knowledge graphs. The conceptual architecture of the system for supporting the solution of a set of problems based on knowledge graphs is shown, and the principles for implementing software components that work with smart knowledge for intelligent software services are established. © 2024 St. Petersburg Federal Research Center of the Russian Academy of Sciences. All rights reserved. |
mohammed afaan
voted
yuexi
voted
Final decision
#3250
-
Ershov 2023
A Case Study for Compliance as Code with Graphs and Language Models: Public release of the Regulatory Knowledge Graph
arXiv 2023;(): 2023 Ref ID: 7642 The paper presents a study on using language models to automate the construction of executable Knowledge Graph (KG) for compliance. The paper focuses on Abu Dhabi Global Market regulations and taxonomy, involves manual tagging a portion of the regulations, training BERT-based models, which are then applied to the rest of the corpus. Coreference resolution and syntax analysis were used to parse the relationships between the tagged entities and to form KG stored in a Neo4j database. The paper states that the use of machine learning models released by regulators to automate the interpretation of rules is a vital step towards compliance automation, demonstrates the concept querying with Cypher, and states that the produced sub-graphs combined with Graph Neural Networks (GNN) will achieve expandability in judgment automation systems. The graph is open sourced on GitHub to provide structured data for future advancements in the field. |
Davis
voted
Srividya
voted
Final decision
#1273
-
Esmeir 2022
Entity Retrieval from Multilingual Knowledge Graphs
MRL 2022 - 2nd Workshop on Multi-Lingual Representation Learning, Proceedings of the Workshop 2022;():1-15 Association for Computational Linguistics (ACL) 2022 Ref ID: 5326 Knowledge Graphs (KGs) are structured databases that capture real-world entities and their relationships. The task of entity retrieval from a KG aims at retrieving a ranked list of entities relevant to a given user query. While English-only entity retrieval has attracted considerable attention, user queries, as well as the information contained in the KG, may be represented in multiple-and possibly distinct-languages. Furthermore, KG content may vary between languages due to different information sources and points of view. Recent advances in language representation have enabled natural ways of bridging gaps between languages. In this paper, we, therefore, propose to utilise language models (LMs) and diverse entity representations to enable truly multilingual entity retrieval. We propose two approaches: (i) an array of monolingual retrievers and (ii) a single multilingual retriever trained using queries and documents in multiple languages. We show that while our approach is on par with the significantly more complex state-of-the-art method for the English task, it can be successfully applied to virtually any language with an LM. Furthermore, it allows languages to benefit from one another, yielding significantly better performance, both for low- and high-resource languages. © 2022 Association for Computational Linguistics. |
yuexi
voted
Mike
voted
Final decision
#826
-
Ezzabady 2024
Towards Generating High-Quality Knowledge Graphs by Leveraging Large Language Models
29th International Conference on Applications of Natural Language to Information Systems (NLDB) 2024;14762():455-469 Univ Turin, Turin, ITALY Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-70239-6_31 · Ref ID: 3256 Knowledge graph creation requires relation extraction (RE) tools often trained on annotated data either manually or by distant supervision. Recent approaches operate at the model level to handle new domains with unseen relations, relying on transfer learning or generative approaches in few/zero-shot learning scenarios. In this paper, we adopt a different strategy by operating instead at the level of dataset creation. We, for the first time to the best of our knowledge, investigate the ability of prompt-based models to build high-quality RE datasets relying on GPT4 to extract triples from sentences. Our approach is further enhanced by linking our knowledge graph to Wikidata, a step that enriches our dataset and ensures its interoperability. This strategy has been successfully employed in two use cases: COVID and health relation extraction. |
Srividya
voted
Xinchen
voted
Final decision
#678
-
Faghihi 2024
Prompt2DeModel: Declarative Neuro-Symbolic Modeling with Natural Language
18th International Conference on Neural-Symbolic Learning and Reasoning (NeSy) 2024;14980():315-327 Barcelona, SPAIN Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-71170-1_25 · Ref ID: 3719 This paper presents a conversational pipeline for crafting domain knowledge for complex neuro-symbolic models through natural language prompts. It leverages large language models to generate declarative programs in the DomiKnowS framework. The programs in this framework express concepts and their relationships as a graph in addition to logical constraints between them. The graph, later, can be connected to trainable neural models according to those specifications. Our proposed pipeline utilizes techniques like dynamic in-context demonstration retrieval, model refinement based on feedback from a symbolic parser, visualization, and user interaction to generate the tasks' structure and formal knowledge representation. This approach empowers domain experts, even those not well-versed in ML/AI, to formally declare their knowledge to be incorporated in customized neural models in the DomiKnowS framework. |
Davis
voted
Mike
voted
Final decision
#672
-
Fan 2024
Progressive Distillation Based on Masked Generation Feature Method for Knowledge Graph Completion
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():8380-8388 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3284 In recent years, knowledge graph completion (KGC) models based on pre-trained language model (PLM) have shown promising results. However, the large number of parameters and high computational cost of PLM models pose challenges for their application in downstream tasks. This paper proposes a progressive distillation method based on masked generation features for KGC task, aiming to significantly reduce the complexity of pre-trained models. Specifically, we perform pre-distillation on PLM to obtain high-quality teacher models, and compress the PLM network to obtain multi-grade student models. However, traditional feature distillation suffers from the limitation of having a single representation of information in teacher models. To solve this problem, we propose masked generation of teacher-student features, which contain richer representation information. Furthermore, there is a significant gap in representation ability between teacher and student. Therefore, we design a progressive distillation method to distill student models at each grade level, enabling efficient knowledge transfer from teachers to students. The experimental results demonstrate that the model in the pre-distillation stage surpasses the existing state-of-the-art methods. Furthermore, in the progressive distillation stage, the model significantly reduces the model parameters while maintaining a certain level of performance. Specifically, the model parameters of the lower-grade student model are reduced by 56.7% compared to the baseline. |
brandon
voted
Kwesi
voted
Final decision
#3479
-
Fan 2024
Graph Machine Learning in the Era of Large Language Models (LLMs)
arXiv 2024;(): 2024 Ref ID: 8252 Graphs play an important role in representing complex relationships in various domains like social networks, knowledge graphs, and molecular discovery. With the advent of deep learning, Graph Neural Networks (GNNs) have emerged as a cornerstone in Graph Machine Learning (Graph ML), facilitating the representation and processing of graph structures. Recently, LLMs have demonstrated unprecedented capabilities in language tasks and are widely adopted in a variety of applications such as computer vision and recommender systems. This remarkable success has also attracted interest in applying LLMs to the graph domain. Increasing efforts have been made to explore the potential of LLMs in advancing Graph ML's generalization, transferability, and few-shot learning ability. Meanwhile, graphs, especially knowledge graphs, are rich in reliable factual knowledge, which can be utilized to enhance the reasoning capabilities of LLMs and potentially alleviate their limitations such as hallucinations and the lack of explainability. Given the rapid progress of this research direction, a systematic review summarizing the latest advancements for Graph ML in the era of LLMs is necessary to provide an in-depth understanding to researchers and practitioners. Therefore, in this survey, we first review the recent developments in Graph ML. We then explore how LLMs can be utilized to enhance the quality of graph features, alleviate the reliance on labeled data, and address challenges such as graph heterogeneity and out-of-distribution (OOD) generalization. Afterward, we delve into how graphs can enhance LLMs, highlighting their abilities to enhance LLM pre-training and inference. Furthermore, we investigate various applications and discuss the potential future directions in this promising field. |
Xinchen
voted
mohammed afaan
voted
Final decision
#3259
-
Fang 2023
CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population
arXiv 2023;(): 2023 Ref ID: 7678 Commonsense Knowledge Bases (CSKB) Population, which aims at automatically expanding knowledge in CSKBs with external resources, is an important yet hard task in NLP. Fang et al. (2021a) proposed a CSKB Population (CKBP) framework with an evaluation set CKBP v1. However, CKBP v1 relies on crowdsourced annotations that suffer from a considerable number of mislabeled answers, and the evaluationset lacks alignment with the external knowledge source due to random sampling. In this paper, we introduce CKBP v2, a new high-quality CSKB Population evaluation set that addresses the two aforementioned issues by employing domain experts as annotators and incorporating diversified adversarial samples to make the evaluation data more representative. We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning. A better population model can also help acquire more informative commonsense knowledge as additional supervision signals for both generative commonsense inference and zero-shot commonsense question answering. Specifically, the question-answering model based on DeBERTa-v3-large (He et al., 2023b) even outperforms powerful large language models in a zero-shot setting, including ChatGPT and GPT-3.5. |
mohammed afaan
voted
yuexi
voted
Final decision
#1012
-
Fang 2023
Automatic Knowledge Structuration of Automotive User Manual for Question Answering
Proceedings - 2023 4th International Conference on Computer, Big Data and Artificial Intelligence, ICCBD+AI 2023 2023;():184-190 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICCBD-AI62252.2023.00038 · Ref ID: 4971 Automotive user manuals serve as repositories of valuable information pertaining to a vehicle, leveraging question answering (QA) systems provides users with a convenient means to access this knowledge. In pursuit of developing an efficient QA system for such documents, this paper proposes the organization of the content into a structured knowledge graph-like triplet format. After conducting a comprehensive analysis of the automotive user manual content, we introduce a <subject, function, content> (<s, f, c>) triplet knowledge representation to represent the knowledge. Our approach involves a three-step pipeline for extracting these triplets from semi-structured XML documents. Central to this structure is the "content" node, forming the core of knowledge items. Leveraging the in-context learning abilities of an off-the-shelf Large Language Model (LLM), specifically ChatGPT, the "subject" and "function" components are induced from the "content" node. To ensure compactness and coherence in knowledge representation, a tailored phrase normalization process is designed to select identical phrases. Additionally, a LLM-powered evaluation method is employed to validate the extracted triplets, affirming their accuracy and relevance. This methodology demonstrates the effectiveness of our proposed approach in automating the structuration of knowledge within automotive user manuals for seamless QA. © 2023 IEEE. |
Srividya
voted
Xinchen
voted
Final decision
#1160
-
Fang 2022
Data-Efficient Concept Extraction from Pre-trained Language Models for Commonsense Explanation Generation
Findings of the Association for Computational Linguistics: EMNLP 2022 2022;():5912-5922 Association for Computational Linguistics (ACL) 2022 Ref ID: 5404 Predicting the key explanation concept is essential for generating commonsense explanations. This paper introduces a method to predict the concept from pre-trained language models for commonsense explanation generation. Our experiment found that adopting a language model as the concept extractor and fine-tuning it with 20% training data can improve the quality and accuracy of the generated explanations over multiple evaluation metrics. Compared with conventional methods that search concepts over knowledge graphs, our method does not require the preparation and training models to search through knowledge graphs. To better understand the results from pre-trained language models, we also designed a metric to evaluate the retrieved concepts. Through analysis and experiments, we show the correlation between this metric and the performance of the generators, and we also show the importance of attaching concepts for generating high-quality sentences. © 2022 Association for Computational Linguistics. |
Xinchen
voted
Ishan
voted
Final decision
#762
-
Färber 2023
SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
22nd International Semantic Web Conference (ISWC) 2023;14266():94-112 Athens, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-47243-5_6 · Ref ID: 3289 We present SemOpenAlex, an extensive RDF knowledge graph that contains over 26 billion triples about scientific publications and their associated entities, such as authors, institutions, journals, and concepts. SemOpenAlex is licensed under CC0, providing free and open access to the data. We offer the data through multiple channels, including RDF dump files, a SPARQL endpoint, and as a data source in the Linked Open Data cloud, complete with resolvable URIs and links to other data sources. Moreover, we provide embeddings for knowledge graph entities using high-performance computing. SemOpenAlex enables a broad range of use-case scenarios, such as exploratory semantic search via our website, large-scale scientific impact quantification, and other forms of scholarly big data analytics within and across scientific disciplines. Additionally, it enables academic recommender systems, such as recommending collaborators, publications, and venues, including explainability capabilities. Finally, SemOpenAlex can serve for RDF query optimization benchmarks, creating scholarly knowledge-guided language models, and as a hub for semantic scientific publishing. Data and Services: https://semopenalex.org https://w3id.org/SemOpenAlex Code: https://github.com/metaphacts/semopenalex/ Data License: Creative Commons Zero (CC0) Code License: MIT License |
mohammed afaan
voted
yuexi
voted
Final decision
#1124
-
Feng 2024
Construction and Application of Knowledge Graph for Water Engineering Scheduling Based on Large Language Model
J. Frontier. Comput. Sci. Technol. 2024;18(6):1637-1647 2024 DOI: 10.3778/j.issn.1673-9418.2311098 · Ref ID: 4567 With the growth of water conservancy and the increasing demand for information, handling and representing large volumes of water-related data has become complex. Particularly, scheduling textual data often exists in natural language form, lacking clear structure and standardization. Processing and utilizing such diverse data necessitates extensive domain knowledge and professional expertise. To tackle this challenge, a method based on large language model has been proposed to construct a knowledge graph for water engineering scheduling. This approach involves collecting and preprocessing scheduling rule data at the data layer, leveraging large language models to extract embedded knowledge, constructing the ontology at the conceptual layer, and extracting the "three-step" method prompt strategy at the instance layer. Under the interaction of the data, conceptual, and instance layers, high-performance extraction of rule texts is achieved, and the construction of the dataset and knowledge graph is completed. Experimental results show that the F1 value of the extraction method in this paper reaches 85.5%, and the effectiveness and rationality of the modules of the large language model are validated through ablation experiments. This graph integrates dispersed water conservancy rule information, effectively handles unstructured textual data, and offers visualization querying and functionality tracing. It aids professionals in assessing water conditions and selecting appropriate scheduling schemes, providing valuable support for conservancy decision-making and intelligent reasoning. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
Xinchen
voted
mohammed afaan
voted
Final decision
#1548
-
Feng 2024
Label design method for flood control scheduling rules assisted by LLM
The information extraction of flood control dispatching rules is of great significance for flood control dispatching automation, and the design of labeling systems is pivotal for information extraction. Traditional designs often have comprehension biases and omissions, leading to issues like overgeneralization and incompleteness. Addressing these imperfections, this research emphasizes the extraction of rules in flood scheduling texts, proposing an enhanced approach for labeling optimization. Large Language Models (LLM) are utilized for tasks like label refinement and generation, boosting label precision and clarity, and a technique for extracting entity relationship triplets is also presented for datasets with many labels. Grouping these triplets enhances extraction performance in label-rich datasets. A visual knowledge graph for flood control scheduling using Neo4j is also developed. This research offers foundational insights for future work in flood control scheduling knowledge extraction. © 2024 International Research and Training Center on Erosion and Sedimentation and China Water and Power Press. All rights reserved. |
Kwesi
voted
mohammed afaan
voted
Final decision
#3718
-
Feng 2024
Monitoring Latent World States in Language Models with Propositional Probes
arXiv 2024;(): 2024 Ref ID: 8431 Language models are susceptible to bias, sycophancy, backdoors, and other tendencies that lead to unfaithful responses to the input context. Interpreting internal states of language models could help monitor and correct unfaithful behavior. We hypothesize that language models represent their input contexts in a latent world model, and seek to extract this latent world state from the activations. We do so with 'propositional probes', which compositionally probe tokens for lexical information and bind them into logical propositions representing the world state. For example, given the input context ''Greg is a nurse. Laura is a physicist.'', we decode the propositions ''WorksAs(Greg, nurse)'' and ''WorksAs(Laura, physicist)'' from the model's activations. Key to this is identifying a 'binding subspace' in which bound tokens have high similarity (''Greg'' and ''nurse'') but unbound ones do not (''Greg'' and ''physicist''). We validate propositional probes in a closed-world setting with finitely many predicates and properties. Despite being trained on simple templated contexts, propositional probes generalize to contexts rewritten as short stories and translated to Spanish. Moreover, we find that in three settings where language models respond unfaithfully to the input context – prompt injections, backdoor attacks, and gender bias – the decoded propositions remain faithful. This suggests that language models often encode a faithful world model but decode it unfaithfully, which motivates the search for better interpretability tools for monitoring LMs. |
yuexi
voted
Davis
voted
Final decision
#3979
-
Feng 2024
VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models
arXiv 2024;(): 2024 Ref ID: 8650 Artistic typography is a technique to visualize the meaning of input character in an imaginable and readable manner. With powerful text-to-image diffusion models, existing methods directly design the overall geometry and texture of input character, making it challenging to ensure both creativity and legibility. In this paper, we introduce a dual-branch and training-free method, namely VitaGlyph, enabling flexible artistic typography along with controllable geometry change to maintain the readability. The key insight of VitaGlyph is to treat input character as a scene composed of Subject and Surrounding, followed by rendering them under varying degrees of geometry transformation. The subject flexibly expresses the essential concept of input character, while the surrounding enriches relevant background without altering the shape. Specifically, we implement VitaGlyph through a three-phase framework: (i) Knowledge Acquisition leverages large language models to design text descriptions of subject and surrounding. (ii) Regional decomposition detects the part that most matches the subject description and divides input glyph image into subject and surrounding regions. (iii) Typography Stylization firstly refines the structure of subject region via Semantic Typography, and then separately renders the textures of Subject and Surrounding regions through Controllable Compositional Generation. Experimental results demonstrate that VitaGlyph not only achieves better artistry and readability, but also manages to depict multiple customize concepts, facilitating more creative and pleasing artistic typography generation. Our code will be made publicly at https://github.com/Carlofkl/VitaGlyph. |
Mike
voted
Srividya
voted
Final decision
#404
-
Feng 2023
KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding
61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():2116-2138 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3137 With the advent of pretrained language models (LMs), increasing research efforts have been focusing on infusing commonsense and domain-specific knowledge to prepare LMs for downstream tasks. These works attempt to leverage knowledge graphs, the de facto standard of symbolic knowledge representation, along with pretrained LMs. While existing approaches have leveraged external knowledge, it remains an open question how to jointly incorporate knowledge graphs representing varying contexts-from local (e.g., sentence), to document-level, to global knowledge-to enable knowledge-rich exchange across these contexts. Such rich contextualization can be especially beneficial for long document understanding tasks since standard pretrained LMs are typically bounded by the input sequence length. In light of these challenges, we propose KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts for long document understanding. KALM first encodes long documents and knowledge graphs into the three knowledge-aware context representations. It then processes each context with context-specific layers, followed by a "context fusion" layer that facilitates knowledge exchange to derive an overarching document representation. Extensive experiments demonstrate that KALM achieves state-of-the-art performance on six long document understanding tasks and datasets. Further analyses reveal that the three knowledge-aware contexts are complementary and they all contribute to model performance, while the importance and information exchange patterns of different contexts vary with respect to different tasks and datasets. |
Ishan
voted
Srividya
voted
Final decision
#1127
-
Feng 2024
Construction of Gem Knowledge Graph Based on Large Language Model
The sources of gemmological knowledge include books, journals, courses, markets and related disciplines. A complete gemmological knowledge system is of great significance to the jewelry industry. Gem knowledge points are numerous and relatively isolated in storage, which is not conducive to practitioners and researchers to retrieve knowledge. This problem can be solved by constructing a gem knowledge base system. The graph can deal with the complex association between knowledge points, which is impossible for widely used structured database, therefore, a knowledge base in the form of a knowledge graph is selected. This paper introduces the traditional knowledge graph construction method and points out the difficulties: high cost, heavy workload, difficult technology and slightly low accuracy. It is proposed to use LLM (Large language model) to complete some tasks in knowledge graph construction to improve the cost and workload. A new knowledge graph construction idea based on LLM is conceived. Its steps include data cleaning, knowledge acquisition and knowledge refinement. According to the above ideas, a gemstone knowledge graph that can cover the gemstone knowledge of the bachelor stage is constructed, and some query scenarios are displayed. The feasibility and high efficiency of the new method are proved by our test evaluation, and the possible application direction of the graph is prospected. © 2024 Editorial Department of Journal of Gems and Gemmology, China University of Geosciences. All rights reserved. |
Xinchen
voted
mohammed afaan
voted
Final decision
#1211
-
Feng 2024
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():14664-14690 Association for Computational Linguistics (ACL) 2024 Ref ID: 4293 Despite efforts to expand the knowledge of large language models (LLMs), knowledge gaps-missing or outdated information in LLMs-might always persist given the evolving nature of knowledge. In this work, we study approaches to identify LLM knowledge gaps and abstain from answering questions when knowledge gaps are present. We first adapt existing approaches to model calibration or adaptation through fine-tuning/prompting and analyze their ability to abstain from generating low-confidence outputs. Motivated by their failures in self-reflection and over-reliance on held-out sets, we propose two novel approaches that are based on model collaboration, i.e., LLMs probing other LLMs for knowledge gaps, either cooperatively or competitively. Extensive experiments with three LLMs on four QA tasks featuring diverse knowledge domains demonstrate that both cooperative and competitive approaches to unveiling LLM knowledge gaps achieve up to 19.3% improvements on abstain accuracy against the strongest baseline. Further analysis reveals that our abstention methods pinpoint failure cases in retrieval augmentation and knowledge gaps in multi-hop reasoning. © 2024 Association for Computational Linguistics. |
Xinchen
voted
Davis
voted
Final decision
#3906
-
Feng 2024
Teaching LLMs to Abstain across Languages via Multilingual Feedback
arXiv 2024;(): 2024 Ref ID: 8414 Multilingual LLMs often have knowledge disparities across languages, with larger gaps in under-resourced languages. Teaching LLMs to abstain in the face of knowledge gaps is thus a promising strategy to mitigate hallucinations in multilingual settings. However, previous studies on LLM abstention primarily focus on English; we find that directly applying existing solutions beyond English results in up to 20.5% performance gaps between high and low-resource languages, potentially due to LLMs' drop in calibration and reasoning beyond a few resource-rich languages. To this end, we propose strategies to enhance LLM abstention by learning from multilingual feedback, where LLMs self-reflect on proposed answers in one language by generating multiple feedback items in related languages: we show that this helps identifying the knowledge gaps across diverse languages, cultures, and communities. Extensive experiments demonstrate that our multilingual feedback approach outperforms various strong baselines, achieving up to 9.2% improvement for low-resource languages across three black-box and open models on three datasets, featuring open-book, closed-book, and commonsense QA. Further analysis reveals that multilingual feedback is both an effective and a more equitable abstain strategy to serve diverse language speakers, and cultural factors have great impact on language selection and LLM abstention behavior, highlighting future directions for multilingual and multi-cultural reliable language modeling. |
Xinchen voted · Davis voted
#170
-
Feng 2023
Detecting contradictions from IoT protocol specification documents based on neural generated knowledge graph
Due to the boom of Internet of Things (IoT) in recent years, various IoT devices are connected to the Internet and communicate with each other through network protocols such as the Constrained Application Protocol (CoAP). These protocols are typically defined and described in specification documents, such as Request for Comments (RFC), which are written in natural or semi-formal languages. Since developers largely follow the specification documents when implementing web protocols, they have become the de facto protocol specifications. Therefore, it must be ensured that the descriptions in them are consistent to avoid technological issues, incompatibility, security risks, or even legal concerns. In this work, we propose Neural RFC Knowledge Graph (NRFCKG), a neural network-generated knowledge graph based contradictions detection tool for IoT protocol specification documents. Our approach can automatically parse the specification documents and construct knowledge graphs from them through entity extraction, relation extraction, and rule extraction with large language models. It then conducts an intra-entity and inter-entity contradiction detection over the generated knowledge graph. We implement NRFCKG and apply it to the most extensively used messaging protocols in IoT, including the main RFC (RFC7252) of CoAP, the specification document of MQTT, and the specification document of AMQP. Our evaluation shows that NRFCKG generalizes well to other specification documents and it manages to detect contradictions from these IoT protocol specification documents.(c) 2023 ISA. Published by Elsevier Ltd. All rights reserved. |
mohammed afaan voted · yuexi voted
#605
-
Fernando 2020
Neural memory plasticity for medical anomaly detection
In the domain of machine learning, Neural Memory Networks (NMNs) have recently achieved impressive results in a variety of application areas including visual question answering, trajectory prediction, object tracking, and language modelling. However, we observe that the attention based knowledge retrieval mechanisms used in current NMNs restrict them from achieving their full potential as the attention process retrieves information based on a set of static connection weights. This is suboptimal in a setting where there are vast differences among samples in the data domain; such as anomaly detection where there is no consistent criteria for what constitutes an anomaly. In this paper, we propose a plastic neural memory access mechanism which exploits both static and dynamic connection weights in the memory read, write and output generation procedures. We demonstrate the effectiveness and flexibility of the proposed memory model in three challenging anomaly detection tasks in the medical domain: abnormal EEG identification, MRI tumour type classification and schizophrenia risk detection in children. In all settings, the proposed approach outperforms the current state-of-the-art. Furthermore, we perform an in-depth analysis demonstrating the utility of neural plasticity for the knowledge retrieval process and provide evidence on how the proposed memory model generates sparse yet informative memory outputs. (C) 2020 Elsevier Ltd. All rights reserved. |
Kwesi voted · Davis voted
#3528
-
Forer 2024
Inferring Scientific Cross-Document Coreference and Hierarchy with Definition-Augmented Relational Reasoning
arXiv 2024;(): 2024 Ref ID: 8622 We address the fundamental task of inferring cross-document coreference and hierarchy in scientific texts, which has important applications in knowledge graph construction, search, recommendation and discovery. LLMs can struggle when faced with many long-tail technical concepts with nuanced variations. We present a novel method which generates context-dependent definitions of concept mentions by retrieving full-text literature, and uses the definitions to enhance detection of cross-document relations. We further generate relational definitions, which describe how two concept mentions are related or different, and design an efficient re-ranking approach to address the combinatorial explosion involved in inferring links across papers. In both fine-tuning and in-context learning settings we achieve large gains in performance. We provide analysis of generated definitions, shedding light on the relational reasoning ability of LLMs over fine-grained scientific concepts. |
Mike voted · mohammed afaan voted
#2797
-
Francis 2017
Poster Abstract: Context Intelligence in Pervasive Environments
2017 IEEE/ACM Second International Conference on Internet-of-Things Design and Implementation (IoTDI) 2017;():315-316 2017 Ref ID: 6443 Intelligent personalization systems are becoming increasingly reliant on contextually-relevant devices and services, such as those available within modern IoT deployments. An IoT context may emerge-or become pervasive-when the intelligent system generates knowledge from dialogue-based interactions with the end-user; the context is strengthened even further by incorporating state representations about the environment (e.g., generated from wireless sensor data) into the knowledge graph. This is crucial for pervasive applications like digital assistance in IoT, where context-aware systems need to adapt quickly: activities like leaving work home-bound, driving to the grocery store, arriving at home, and walking the dog, for example, can occur in a relatively short period of time-during which an intelligent assistant must be able to support user requests in a consistent and coherent manner. Given that computational ontologies can serve as semantic models for heterogeneous data, they are becoming increasingly viable for reasoning across different IoT contexts. This involves: (a) federation and dynamic pruning of multiple modular ontologies, ideally, to comprehensively capture only the knowledge that will facilitate execution of a multi-context task; (b) fast consistency-checking and ontology-based inferences, aided by rules-based execution environments that can evaluate/transform ambient wireless sensor network (WSN) data, in real-time; and (c) run-time execution of ontology-based control procedures, through rule-engine actuation commands sent across the WSN. Only by realizing these functionalities may intelligent systems be capable of reasoning over device properties, system states, and user activities, while appropriately delegating commands to other intelligent agents or other relevant IoT services. 
In this poster, we illustrate how a multi-context knowledge base can be structured on the basis of modular ontologies and integrated with a distributed rules-based inference engine in multiple smart-building environments, in order to enable scalable contextual reasoning for intelligent assistance. Preliminary results are also discussed. This work is conducted through the partnership of Bosch Research Pittsburgh and Carnegie Mellon University (CMU), and is in partial satisfaction of CMU's Bosch Energy Research Network (BERN) grant, awarded for developments in intelligent building solutions. The approach we describe is also partially based on the Ubiquitous Personal Assistant (UPA) project, Bosch Research's largest research initiative worldwide. |
mohammed afaan voted · yuexi voted
#1808
-
Fu 2024
Research on KG and LLM knowledge-enhanced pediatric diseases intelligent diagnosis
Proceedings of SPIE - The International Society for Optical Engineering 2024;13171(): SPIE 2024 DOI: 10.1117/12.3032061 · Ref ID: 4534 Pediatric diseases are challenging to diagnose due to their complex and diverse characteristics. To assist doctors in diagnosis and help them make informed decisions, this paper proposes a Knowledge graph and Large language model Knowledge-Enhanced (KLKE) intelligent diagnosis model. The intelligent diagnosis task is treated as a text classification task, where the original Electronic Medical Record are input into MacBERT model encoder to obtain the contextual representation after key information enhancement and KG prompted LLM enhancement respectively. The final text representation is obtained by concatenating and merging the enhanced representations. Graph Convolutional Network is utilized to obtain the knowledge representation and the two representations are fused using a fusion method based on interactive attention mechanism. Experiments are conducted on PeEMR, and compared with models that only fuses triples and graph structures. The KLKE achieved an increase of 9.15% and 2.28% in F1_micro scores respectively. © 2024 SPIE. |
Kwesi voted · mohammed afaan voted
#889
-
Fukuda 2024
Zero-Shot Query Experiments in Knowledge Graph Reasoning Challenge for Older Adults Safety
18th IEEE International Conference on Semantic Computing (ICSC) 2024;():301-305 Laguna Hills, CA Ieee Computer Soc 2024 DOI: 10.1109/icsc59802.2024.00054 · Ref ID: 3149 The 2nd International Knowledge Graph Reasoning Challenge involves social issues focusing on the safety of older adults in their homes. The challenge aims to extract statistical information related to actions and objects that pose risks to daily life. To answer each question in a video, we used Video-LLaVa, a large-scale visual language model (LVLM), using two approaches. The first approach involves inputting question text and video into Video-LLaVa. In this paper, we describe the results of zero-shot queries. The second approach is to obtain a detailed description of the video output using Video-LLaVa and then answer questions based on it. We have yet to achieve good results with these approaches, but we have identified some issues that we will discuss along with the results. |
mohammed afaan voted · yuexi voted
#1179
-
Furumai 2023
Detecting Dialogue Hallucination Using Graph Neural Networks
Proceedings - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023 2023;():871-877 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICMLA58977.2023.00128 · Ref ID: 4945 Even though large language models (LLMs) accumulate tremendous knowledge, dialogue systems built with LLMs induce hallucinations, leading to the generation of non-factual responses. How to provide proper references to achieve interpretable hallucination detection is a key issue that needs to be addressed. In this paper, we propose a graph neural network (GNN)-based method to achieve high-performance and interpretable hallucination detection for domain-specific dialogue systems. The method involves performing graph matching between a reference knowledge graph obtained from a knowledge database and a response knowledge graph extracted from the response to detect non-factual responses. By comparing with strong baselines, our method achieves a recall improvement of up to 11% and infers the cause of hallucinations with a probability of over 79%. © 2023 IEEE. |
yuexi voted · Mike voted
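The Furumai 2023 abstract above detects hallucinations by matching a knowledge graph extracted from a response against a reference knowledge graph. A much simpler stand-in for their GNN-based graph matching is exact set overlap of (subject, relation, object) triples; the triples and helper name below are invented for illustration, not taken from the paper.

```python
def unsupported_triples(response_triples, reference_kg):
    """Return response triples with no support in the reference KG.
    A non-empty result flags the response as potentially hallucinated."""
    ref = set(reference_kg)
    return [t for t in response_triples if t not in ref]

# Hypothetical reference KG and triples extracted from a model response.
reference_kg = {
    ("Eiffel Tower", "located_in", "Paris"),
    ("Paris", "capital_of", "France"),
}
response = [
    ("Eiffel Tower", "located_in", "Paris"),
    ("Eiffel Tower", "built_in", "1923"),  # unsupported fact
]
print(unsupported_triples(response, reference_kg))
```

Real systems need fuzzy matching (aliases, paraphrased relations), which is exactly the gap the paper's learned graph matcher addresses; exact overlap is only the degenerate baseline.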
#2765
-
Gallo 2007
An Ontology for the Quality of Experience framework
2007 IEEE International Conference on Systems, Man and Cybernetics 2007;():1540-1544 2007 DOI: 10.1109/ICSMC.2007.4414109 · Ref ID: 6668 Agents need a formal representation of knowledge. This is modelled in an ontology. We present a survey of ontologies in the area of QoS management. From the survey we identified that improvements can be done in the Ontology of the Quality of Experience framework. We believe that with the extension full QoS Management capabilities can be then supported in the context of Quality of Experience. We focus in the appropriate QoS mechanism selection, network monitoring and QoS adaptation. With the additional concepts and actions the QoE ontology meets three key requirements for a QoS ontology: (i) decide which QoS mechanisms is better to fits the user needs; (ii) perform QoS monitoring and detection of SLA violations; and (iii) carry out QoS adaptation. Two experimental scenarios are currently being conducted. In scenario 1 the original ontology is used whilst for scenario 2 the extended version is employed. An initial comparative analysis is performed. |
mohammed afaan voted · yuexi voted
#3696
-
Gan 2023
Making Large Language Models Better Knowledge Miners for Online Marketing with Progressive Prompting Augmentation
arXiv 2023;(): 2023 Ref ID: 7978 Nowadays, the rapid development of mobile economy has promoted the flourishing of online marketing campaigns, whose success greatly hinges on the efficient matching between user preferences and desired marketing campaigns where a well-established Marketing-oriented Knowledge Graph (dubbed as MoKG) could serve as the critical "bridge" for preference propagation. In this paper, we seek to carefully prompt a Large Language Model (LLM) with domain-level knowledge as a better marketing-oriented knowledge miner for marketing-oriented knowledge graph construction, which is however non-trivial, suffering from several inevitable issues in real-world marketing scenarios, i.e., uncontrollable relation generation of LLMs, insufficient prompting ability of a single prompt, and the unaffordable deployment cost of LLMs. To this end, we propose PAIR, a novel Progressive prompting Augmented mIning fRamework for harvesting marketing-oriented knowledge graph with LLMs. In particular, we reduce the pure relation generation to an LLM based adaptive relation filtering process through the knowledge-empowered prompting technique. Next, we steer LLMs for entity expansion with progressive prompting augmentation, followed by a reliable aggregation with comprehensive consideration of both self-consistency and semantic relatedness. In terms of online serving, we specialize in a small and white-box PAIR (i.e., LightPAIR), which is fine-tuned with a high-quality corpus provided by a strong teacher-LLM. Extensive experiments and practical applications in audience targeting verify the effectiveness of the proposed (Light)PAIR. |
Davis voted · Srividya voted
#162
-
Gao 2024
Deep Learning-Based Fault Knowledge Graph Construction for Power Communication Networks
6th Asia Energy and Electrical Engineering Symposium (AEEES) 2024;():1088-1093 Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu, PEOPLES R CHINA Ieee 2024 DOI: 10.1109/aeees61147.2024.10544941 · Ref ID: 3028 Power communication network is a crucial infrastructure in the model power system, and its maintenance capability are crucial to ensuring the stable operation of power grid business. As an organized semantic knowledge base, the knowledge graph effectively organizes power communication network fault documentation and expert experience to enhance intelligent maintenance. This paper outlines a top-down approach to systematically construct a fault knowledge graph in the domain of power communication networks. The approach utilizes a seven-step method to establish a domain ontology model and integrates deep learning algorithms, including pre-trained language models, bidirectional long short time memory networks, convolutional neural networks and attention mechanisms. These algorithms process unstructured text to extract key entities and relationships. The effectiveness of the approach is verified through experiments using a product device document as a test case. Extracted knowledge is then visualized and stored using Neo4j database. Finally, this paper proposes a knowledge service model centered on fault knowledge graph and explores its application in fault diagnosis. |
mohammed afaan voted · Ishan voted
#3576
-
Gao 2022
KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models
arXiv 2022;(): 2022 Ref ID: 7522 Previous works show the great potential of pre-trained language models (PLMs) for storing a large amount of factual knowledge. However, to figure out whether PLMs can be reliable knowledge sources and used as alternative knowledge bases (KBs), we need to further explore some critical features of PLMs. Firstly, knowledge memorization and identification abilities: traditional KBs can store various types of entities and relationships; do PLMs have a high knowledge capacity to store different types of knowledge? Secondly, reasoning ability: a qualified knowledge source should not only provide a collection of facts, but support a symbolic reasoner. Can PLMs derive new knowledge based on the correlations between facts? To evaluate these features of PLMs, we propose a benchmark, named Knowledge Memorization, Identification, and Reasoning test (KMIR). KMIR covers 3 types of knowledge, including general knowledge, domain-specific knowledge, and commonsense, and provides 184,348 well-designed questions. Preliminary experiments with various representative pre-training language models on KMIR reveal many interesting phenomenons: 1) The memorization ability of PLMs depends more on the number of parameters than training schemes. 2) Current PLMs are struggling to robustly remember the facts. 3) Model compression technology retains the amount of knowledge well, but hurts the identification and reasoning abilities. We hope KMIR can facilitate the design of PLMs as better knowledge sources. |
Mike voted · Xinchen voted
#1350
-
Gao 2024
Generative News Recommendation
WWW 2024 - Proceedings of the ACM Web Conference 2024;():3444-3453 Association for Computing Machinery, Inc 2024 DOI: 10.1145/3589334.3645448 · Ref ID: 4060 Most existing news recommendation methods tackle this task by conducting semantic matching between candidate news and user representation produced by historical clicked news. However, they overlook the high-level connections among different news articles and also ignore the profound relationship between these news articles and users. And the definition of these methods dictates that they can only deliver news articles as-is. On the contrary, integrating several relevant news articles into a coherent narrative would assist users in gaining a quicker and more comprehensive understanding of events. In this paper, we propose a novel generative news recommendation paradigm that includes two steps: (1) Leveraging the internal knowledge and reasoning capabilities of the Large Language Model (LLM) to perform high-level matching between candidate news and user representation; (2) Generating a coherent and logically structured narrative based on the associations between related news and user interests, thus engaging users in further reading of the news. Specifically, we propose GNR to implement the generative news recommendation paradigm. First, we compose the dual-level representation of news and users by leveraging LLM to generate theme-level representations and combine them with semantic-level representations. Next, in order to generate a coherent narrative, we explore the news relation and filter the related news according to the user preference. Finally, we propose a novel training method named UIFT to train the LLM to fuse multiple news articles in a coherent narrative. Extensive experiments show that GNR can improve recommendation accuracy and eventually generate more personalized and factually consistent narratives. © 2024 Owner/Author. |
mohammed afaan voted · yuexi voted
#421
-
Gao 2024
Knowledge Enhanced Vision and Language Model for Multi-Modal Fake News Detection
The rapid dissemination of fake news and rumors through the Internet and social media platforms poses significant challenges and raises concerns in the public sphere. Automatic detection of fake news plays a crucial role in mitigating the spread of misinformation. While recent approaches have focused on leveraging neural networks to improve textual and visual representations in multi-modal fake news analysis, they often overlook the potential of incorporating knowledge information to verify facts within news articles. In this paper, we present a vision and language model that incorporates knowledge to enhance multi-modal fake news detection. Our proposed model integrates information from large scale open knowledge graphs to augment its ability to discern the veracity of news content. Unlike previous methods that utilize separate models to extract textual and visual features, we synthesize a unified model capable of extracting both types of features simultaneously. To represent news articles, we introduce a graph structure where nodes encompass entities, relationships extracted from the textual content, and objects depicted in associated images. By utilizing the knowledge graph, we establish meaningful relationships between nodes within the news articles. Experimental evaluations on a real-world multi-modal dataset from Twitter demonstrate significant performance improvement by incorporating knowledge information. |
mohammed afaan voted · yuexi voted
#3650
-
Gao 2023
Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction
arXiv 2023;(): 2023 Ref ID: 7820 Electronic Health Records (EHRs) and routine documentation practices play a vital role in patients' daily care, providing a holistic record of health, diagnoses, and treatment. However, complex and verbose EHR narratives overload healthcare providers, risking diagnostic inaccuracies. While Large Language Models (LLMs) have showcased their potential in diverse language tasks, their application in the healthcare arena needs to ensure the minimization of diagnostic errors and the prevention of patient harm. In this paper, we outline an innovative approach for augmenting the proficiency of LLMs in the realm of automated diagnosis generation, achieved through the incorporation of a medical knowledge graph (KG) and a novel graph model: Dr.Knows, inspired by the clinical diagnostic reasoning process. We derive the KG from the National Library of Medicine's Unified Medical Language System (UMLS), a robust repository of biomedical knowledge. Our method negates the need for pre-training and instead leverages the KG as an auxiliary instrument aiding in the interpretation and summarization of complex medical concepts. Using real-world hospital datasets, our experimental results demonstrate that the proposed approach of combining LLMs with KG has the potential to improve the accuracy of automated diagnosis generation. More importantly, our approach offers an explainable diagnostic pathway, edging us closer to the realization of AI-augmented diagnostic decision support systems. |
Srividya voted · Xinchen voted
#1077
-
Garbas 2024
Choose Your Transformer: Improved Transferability Estimation of Transformer Models on Classification Tasks
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():12752-12768 Association for Computational Linguistics (ACL) 2024 Ref ID: 4303 There currently exists a multitude of pre-trained transformer language models (LMs) that are readily available. From a practical perspective, this raises the question of which pre-trained LM will perform best if fine-tuned for a specific downstream NLP task. However, exhaustively fine-tuning all available LMs to determine the best-fitting model is computationally infeasible. To address this problem, we present an approach that inexpensively estimates a ranking of the expected performance of a given set of candidate LMs for a given task. Following a layer-wise representation analysis, we extend existing approaches such as H-score and LogME by aggregating representations across all layers of the transformer model. We present an extensive analysis of 20 transformer LMs, 6 downstream NLP tasks, and various estimators (linear probing, kNN, H-score, and LogME). Our evaluation finds that averaging the layer representations significantly improves the Pearson correlation coefficient between the true model ranks and the estimate, increasing from 0.58 to 0.86 for LogME and from 0.65 to 0.88 for H-score. © 2024 Association for Computational Linguistics. |
Srividya voted · Mike voted
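The Garbas 2024 abstract above evaluates transferability estimators by the Pearson correlation between estimated model rankings and true fine-tuned performance (e.g., 0.58 rising to 0.86 for LogME). As a minimal sketch of that evaluation metric alone, Pearson's r can be computed from scratch; the five score pairs below are toy numbers, not the paper's data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy numbers: cheap estimator scores vs. true fine-tuned accuracy for 5 LMs.
estimated = [0.62, 0.71, 0.55, 0.80, 0.68]
true_acc  = [0.78, 0.83, 0.74, 0.90, 0.81]
print(round(pearson(estimated, true_acc), 3))
```

A correlation near 1.0 means the cheap estimator ranks candidate LMs almost exactly as expensive exhaustive fine-tuning would, which is the property the paper's layer-averaging improves.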
#2515
-
García-García 2021
gPROFIT: A Tool to Assist the Automatic Extraction of Business Knowledge From Legacy Information Systems
Business digitization is a crucial strategy for business growth in the 21st century. Its benefits include improving business process automation, customer satisfaction, productivity, decision-making, turnover, and adaptation to market changes. However, digitization is not a trivial task. As a major paradigm and mindset shift, it involves a lot of effort within an organization and therefore requires commitment from employees and managers. This is especially critical in companies whose business processes are mostly reliant on legacy information systems (LIS), which are usually specialized and based on technological architectures that could be considered obsolete. The replacement of these systems by more recent, process-oriented technologies, the building up of employees' know-how and the continued use of outdated documentation are difficult, expensive tasks that hinder the initiation of continuous improvement processes in companies. This paper proposes techniques for finding and extracting process models from legacy databases. Specifically, it ( i) lays the theoretical foundations of a model-driven framework for systematically extracting business process models (conform to standard BPMN notation) from LIS considering process time perspective, and (ii) proposes a technological tool called gPROFIT, which uses machine learning techniques to support that theoretical framework, facilitate its use in real environments and extract the business knowledge embedded in such legacy systems. The paper also presents proofs-of-concept showing how our proposal has been validated in several legacy systems. |
mohammed afaan voted · yuexi voted
#547
-
Garrido-Muñoz 2023
MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish
The study of bias in language models is a growing area of work, however, both research and resources are focused on English. In this paper, we make a first approach focusing on gender bias in some freely available Spanish language models trained using popular deep neural networks, like BERT or RoBERTa. Some of these models are known for achieving state-of-the-art results on downstream tasks. These promising results have promoted such models' integration in many real-world applications and production environments, which could be detrimental to people affected for those systems. This work proposes an evaluation framework to identify gender bias in masked language models, with explainability in mind to ease the interpretation of the evaluation results. We have evaluated 20 different models for Spanish, including some of the most popular pretrained ones in the research community. Our findings state that varying levels of gender bias are present across these models.This approach compares the adjectives proposed by the model for a set of templates. We classify the given adjectives into understandable categories and compute two new metrics from model predictions, one based on the internal state (probability) and the other one on the external state (rank). Those metrics are used to reveal biased models according to the given categories and quantify the degree of bias of the models under study. |
Srividya voted · Xinchen voted
#3986
-
Ge 2024
What Do the Circuits Mean? A Knowledge Edit View
arXiv 2024;(): 2024 Ref ID: 8421 In the field of language model interpretability, circuit discovery is gaining popularity. Despite this, the true meaning of these circuits remains largely unanswered. We introduce a novel method to learn their meanings as a holistic object through the lens of knowledge editing. We extract circuits in the GPT-2 base model for classification tasks related to syntax and model safety, and study their knowledge property via a model edit dataset containing hierarchical entities. We find that these circuits contain entity knowledge but resist new knowledge, demonstrating a "confirmation bias" behavior. Additionally, we examine the impact of circuit size, discovering that an ideal "theoretical circuit" where essential knowledge is concentrated likely incorporates more than 5% but less than 50% of the model's parameters. We also assess the overlap between circuits from different datasets, finding moderate similarities. We proceed with analyzing the modular components of the circuits, finding that up to 60% of the circuits consist of layer normalization modules rather than attention or MLP modules, adding evidence to the ongoing debates regarding knowledge localization. In summary, our findings offer novel insights into the meanings of the circuits, and introduce directions for further interpretability and safety research of language models. |
Mike voted · Srividya voted
#435
-
Ge 2024
Knowledge Graph Embedding: An Overview
Many mathematical models have been leveraged to design embeddings for representing Knowledge Graph (KG) entities and relations for link prediction and many downstream tasks. These mathematically-inspired models are not only highly scalable for inference in large KGs, but also have many explainable advantages in modeling different relation patterns that can be validated through both formal proofs and empirical results. In this paper, we make a comprehensive overview of the current state of research in KG completion. In particular, we focus on two main branches of KG embedding (KGE) design: 1) distance-based methods and 2) semantic matching-based methods. We discover the connections between recently proposed models and present an underlying trend that might help researchers invent novel and more effective models. Next, we delve into CompoundE and CompoundE3D, which draw inspiration from 2D and 3D affine operations, respectively. They encompass a broad spectrum of distance-based embedding techniques. We will also discuss an emerging approach for KG completion which leverages pre-trained language models (PLMs) and textual descriptions of entities and relations and offer insights into the integration of KGE embedding methods with PLMs for KG completion. |
Ishan voted · Srividya voted
#2390
-
Ge 2024
Enhancing Pre-Trained Language Models with Knowledge Representation Using Line Graphs
2024 3rd International Conference on Artificial Intelligence and Computer Information Technology (AICIT) 2024;():1-9 2024 DOI: 10.1109/AICIT62434.2024.10730175 · Ref ID: 7022 To address the inherent limitation of pre-trained language models regarding factual knowledge, current efforts encompass a variety of methods aimed at bolstering their capabilities through the integration of knowledge graphs as external sources. This augmentation seeks to enhance their performance across knowledge-driven tasks. However, the challenges of effectively encapsulating entity knowledge and mitigating the storage overhead associated with external knowledge persist. In this paper, we present a novel approach for representing entity knowledge. Our method leverages the relational context surrounding entities, departing from the conventional practice of employing distinct vector representations for each entity. Specifically, we propose a transformation of entity-level subgraphs into line graphs, allowing us to explicitly capture and model relational patterns inherent in entity adjacencies. In contrast to the original graph-based representation, our line graph-based model exhibits a heightened capacity to capture intricate knowledge structures. Through empirical evaluation across three downstream tasks - namely, relation extraction, entity typing, and question answering over knowledge graphs - we substantiate the efficacy of our approach. The experimental results demonstrate the superior performance of our model over prevailing state-of-the-art methodologies across the majority of tasks. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3168
-
Ge 2024
WorldGPT: Empowering LLM as Multimodal World Model
Proceedings of the 32nd ACM International Conference on Multimedia 2024;():7346–7355 Melbourne VIC, Australia Association for Computing Machinery 2024 DOI: 10.1145/3664647.3681488 · Ref ID: 7203 |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3321
-
Gema 2024
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
arXiv 2024;(): 2024 Ref ID: 8749 Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations and that contrasting the outputs of the base LLM and the masked LLM can reduce hallucinations. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%). |
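The contrastive step summarized in this abstract can be sketched with toy next-token scores. Everything below is a hypothetical simplification: DeCoRe operates on real model logits and uses conditional entropy to modulate the contrast dynamically, which this sketch only hints at with a standalone entropy helper.

```python
import math

def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy, the kind of uncertainty signal DeCoRe uses
    to decide how strongly to apply the contrast."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def contrast(base_logits, masked_logits, alpha=1.0):
    """Amplify tokens the base model prefers over the masked
    (retrieval-heads-removed) model: b + alpha * (b - m)."""
    return [b + alpha * (b - m) for b, m in zip(base_logits, masked_logits)]
```

On toy numbers, the contrast widens the margin for a token the base model supports but the masked model does not, which is the intuition behind suppressing hallucinated continuations.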
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#234
-
Gerritse 2022
Entity-aware Transformers for Entity Search
45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2022;():1455-1465 Madrid, SPAIN Assoc Computing Machinery 2022 DOI: 10.1145/3477495.3531971 · Ref ID: 3290 Pre-trained language models such as BERT have been a key ingredient to achieve state-of-the-art results on a variety of tasks in natural language processing and, more recently, also in information retrieval. Recent research even claims that BERT is able to capture factual knowledge about entity relations and properties, the information that is commonly obtained from knowledge graphs. This paper investigates the following question: Do BERT-based entity retrieval models benefit from additional entity information stored in knowledge graphs? To address this research question, we map entity embeddings into the same input space as a pre-trained BERT model and inject these entity embeddings into the BERT model. This entity-enriched language model is then employed on the entity retrieval task. We show that the entity-enriched BERT model improves effectiveness on entity-oriented queries over a regular BERT model, establishing a new state-of-the-art result for the entity retrieval task, with substantial improvements for complex natural language queries and queries requesting a list of entities with a certain property. Additionally, we show that the entity information provided by our entity-enriched model particularly helps queries related to less popular entities. Last, we observe empirically that the entity-enriched BERT models enable fine-tuning on limited training data, which otherwise would not be feasible due to the known instabilities of BERT in few-sample fine-tuning, thereby contributing to data-efficient training of BERT for entity search. |
yuexi
voted
Mike
voted
Final decision
What was the agreed final decision?
#1322
-
Ghanem 2024
Fine-Tuning vs. Prompting: Evaluating the Knowledge Graph Construction with LLMs
CEUR Workshop Proceedings 2024;3747():18 CEUR-WS 2024 Ref ID: 4334 This paper explores Text-to-Knowledge Graph (T2KG) construction, assessing Zero-Shot Prompting (ZSP), Few-Shot Prompting (FSP), and Fine-Tuning (FT) methods with Large Language Models (LLMs). Through comprehensive experimentation with Llama2, Mistral, and Starling, we highlight the strengths of FT, emphasize dataset size's role, and introduce nuanced evaluation metrics. Promising perspectives include synonym-aware metric refinement and data augmentation with LLMs. The study contributes valuable insights to KG construction methodologies, setting the stage for further advancements. © 2024 Copyright for this paper by its authors. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1592
-
Ghassabi 2023
Leveraging Knowledge Graphs for Matching Heterogeneous Entities and Explanation
Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():2910-2919 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BigData59044.2023.10386157 · Ref ID: 4909 Entity matching (EM), also known as record linkage, is crucial in data integration, cleaning, and knowledge base construction. Modern matching techniques leverage deep learning and pre-trained language models (PLMs) to effectively identify matching records, showcasing significant advancements over traditional methods. However, certain critical matching aspects have received limited attention in these techniques. They heavily rely on PLMs' encodings and face challenges in integrating external sources of knowledge to enhance matching accuracy. Additionally, these techniques often lack transparency, impeding users' understanding of the underlying rationale for matching decisions. Furthermore, they exhibit limitations and decreased performance in handling heterogeneous records from datasets with diverse schemas. This paper presents EXKG, a novel technique that addresses these challenges and effectively matches heterogeneous records with varying attributes. EXKG combines the power of knowledge graphs (KGs) and PLMs to perform record linkage while offering explanatory insights into the matching results. Through experimental studies, we demonstrate that EXKG achieves competitive performance compared to state-of-the-art matching techniques. As a by-product, our solution generates explanations that give end users a comprehensive understanding of the matching process. We evaluate the quality of these explanations in a user study and show they empower end users to make informed decisions. © 2023 IEEE. |
Kwesi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#41
-
Gilbert 2024
Augmented non-hallucinating large language models as medical information curators
Reliably processing and interlinking medical information has been recognized as a critical foundation to the digital transformation of medical workflows, and despite the development of medical ontologies, the optimization of these has been a major bottleneck to digital medicine. The advent of large language models has brought great excitement, and maybe a solution to medicine's 'communication problem' is in sight, but how can the known weaknesses of these models, such as hallucination and non-determinism, be tempered? Retrieval Augmented Generation, particularly through knowledge graphs, is an automated approach that can deliver structured reasoning and a model of truth alongside LLMs, relevant to information structuring and therefore also to decision support. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#2464
-
Gómez-Pérez 2013
A Formalism and Method for Representing and Reasoning with Process Models Authored by Subject Matter Experts
IEEE Transactions on Knowledge and Data Engineering 2013;25(9):1933-1945 2013 DOI: 10.1109/TKDE.2012.127 · Ref ID: 6019 Enabling Subject Matter Experts (SMEs) to formulate knowledge without the intervention of Knowledge Engineers (KEs) requires providing SMEs with methods and tools that abstract the underlying knowledge representation and allow them to focus on modeling activities. Bridging the gap between SME-authored models and their representation is challenging, especially in the case of complex knowledge types like processes, where aspects like frame management, data, and control flow need to be addressed. In this paper, we describe how SME-authored process models can be provided with an operational semantics and grounded in a knowledge representation language like F-logic to support process-related reasoning. The main results of this work include a formalism for process representation and a mechanism for automatically translating process diagrams into executable code following such formalism. Of all the process models authored by SMEs during evaluation, 82 percent were well formed, all of which executed correctly. Additionally, the two optimizations applied to the code generation mechanism produced performance improvements at reasoning time of 25 and 30 percent with respect to the base case, respectively. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#828
-
Gong 2020
Towards Knowledge Enhanced Language Model for Machine Reading Comprehension
Machine reading comprehension is a crucial and challenging task in natural language processing (NLP). Recently, knowledge graph (KG) embedding has gained massive attention as it can effectively provide side information for downstream tasks. However, most previous knowledge-based models do not take into account the structural characteristics of the triples in KGs, and only convert them into vector representations for direct accumulation, leading to deficiencies in knowledge extraction and knowledge fusion. To alleviate this problem, we propose a novel deep model, KCF-NET, which incorporates knowledge graph representations with context as the basis for predicting answers, leveraging a capsule network to encode the intrinsic spatial relationships in the triples of a KG. In KCF-NET, we fine-tune BERT, a high-performance contextual language representation model, to capture complex linguistic phenomena. Besides, a novel fusion structure based on a multi-head attention mechanism is designed to balance the weight of knowledge and context. To evaluate the knowledge expression and reading comprehension ability of our model, we conducted extensive experiments on multiple public datasets such as WN11, FB13, SemEval-2010 Task 8 and SQuAD. Experimental results show that KCF-NET achieves state-of-the-art results in both link prediction and MRC tasks with negligible parameter increase compared to BERT-Base, and achieves competitive results in the triple classification task with significantly reduced model size. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3019
-
González-de-Aledo 2017
Towards a Verification Flow Across Abstraction Levels Verifying Implementations Against Their Formal Specification
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2017;36(3):475-488 2017 DOI: 10.1109/TCAD.2016.2611494 · Ref ID: 6578 The use of formal models to describe early versions of the structure and the behavior of a system has become common practice in industry. UML and OCL are the de-facto specification languages for these tasks. They allow for capturing system properties and module behavior in an abstract but still formal fashion. At the same time, this enables designers to detect errors or inconsistencies in the initial phases of the design flow, even if the implementation has not yet started. Corresponding tools for verification of formal models got established in the recent past. However, verification results are usually not reused in later design steps anymore. In fact, similar verification tasks are applied again, e.g., after the implementation has been completed. This is a waste of computational and human effort. In this paper, we address this problem by proposing a method which checks a given implementation of a system against its corresponding formal model. This allows for transferring verification results already obtained from the formal model to the implementation and, eventually, motivates a new design flow which addresses verification across abstraction levels. This paper describes the applied techniques as well as their orchestration. Afterwards, the applicability of the proposed methodology is demonstrated by means of examples as well as a case study from an industrial context. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#218
-
Gonzalez-Garcia 2024
Enhancing knowledge graphs with microdata and LLMs: the case of Schema.org and Wikidata in touristic information
Purpose: Knowledge graphs (KGs) are structured knowledge bases that represent real-world entities and are used in a variety of applications. Many of them are created and curated from a combination of automated and manual processes. Microdata embedded in Web pages for purposes of facilitating indexing and search engine optimization are a potential source to augment KGs under some assumptions of complementarity and quality that have not been thoroughly explored to date. In that direction, this paper aims to report results of a study that evaluates the potential of using microdata extracted from the Web to augment the large, open, and manually curated Wikidata KG for the domain of touristic information. As large corpora of Web text are currently being leveraged via large language models (LLMs), these are used to compare the effectiveness of the microdata enhancement method.
Design/methodology/approach: The Schema.org taxonomy was used as the source to determine the annotation types to be collected. Here, the authors focused on tourism-related pages as a case study, selecting the relevant Schema.org concepts as the point of departure. The large CommonCrawl resource was used to select those annotations from a large recent sample of the World Wide Web. The extracted annotations were processed and matched with Wikidata to estimate the degree to which microdata produced for SEO might become a valuable resource to complement KGs, or vice versa. The Web pages themselves can also serve as context to produce additional metadata elements, using them as context in pipelines of existing LLMs. That way, both the annotations and the content itself can be used as sources.
Findings: The samples extracted revealed a concentration of metadata annotations in only a few of the relevant Schema.org attributes and also revealed the possible influence of authoring tools in a significant fraction of the microdata produced. The analysis of the overlap of attributes in the sample with those of Wikidata showed the potential of the technique, limited by the imbalance in the presence of attributes. The combination of those with the use of LLMs to produce additional annotations demonstrates the feasibility of the approach for populating existing Wikidata locations. However, in both cases, the effectiveness appears to be lower for entries with less content in the KG, which are arguably the most relevant when considering the scenario of an automated population approach.
Originality/value: The research reports novel empirical findings on the way touristic annotations with an SEO orientation are being produced in the wild and provides an assessment of their potential to complement KGs, or to reuse information from those graphs. It also provides insights on the potential of using LLMs for the task. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#231
-
González 2020
Entity Linking as a Population Mechanism for Skill Ontologies: Evaluating the Use of ESCO and Wikidata
14th International Conference on Metadata and Semantic Research (MTSR) 2020;1355():116-122 Madrid, SPAIN Springer International Publishing Ag 2020 DOI: 10.1007/978-3-030-71903-6_12 · Ref ID: 3348 Ontologies or databases describing occupations in terms of competences or skills are an important resource for a number of applications. Exploiting large knowledge graphs thus becomes a promising direction to update those ontologies with entities of the latter, which may be updated faster, especially in the case of crowd-sourced resources. Here we report a first assessment of the potential of that strategy matching knowledge elements in ESCO to Wikidata using NER and document similarity models available at the Spacy NLP libraries. Results show that the approach may be effective, but the use of pre-trained language models and the short texts included with entities (labels and descriptions) does not result in sufficient quality for a fully automated process. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#779
-
Gosselin 2023
SORBET: A Siamese Network for Ontology Embeddings Using a Distance-Based Regression Loss and BERT
22nd International Semantic Web Conference (ISWC) 2023;14265():561-578 Athens, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-47240-4_30 · Ref ID: 3381 Ontology embedding methods have been popular in recent years, especially when it comes to representation learning algorithms for solving ontology-related tasks. Despite the impact of large language models on knowledge-graph-related tasks, there has been less focus on adapting these models to construct ontology embeddings that are both semantically relevant and faithful to the ontological structure. In this paper, we present a novel ontology embedding method that encodes ontology classes into a pre-trained SBERT through random walks and then fine-tunes the embeddings using a distance-based regression loss. We benchmark our algorithm on four different datasets across two tasks and show the impact of transfer learning and our distance-based loss on the quality of the embeddings. Our results show that SORBET outperforms state-of-the-art ontology embedding techniques on the performed tasks. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#25
-
Gottlob 2024
Artificial Intelligence and Artificial Ignorance
32nd EACSL Annual Conference on Computer Science Logic (CSL) 2024;288(): Naples, ITALY Schloss Dagstuhl, Leibniz Center Informatics 2024 DOI: 10.4230/LIPIcs.CSL.2024.3 · Ref ID: 3638 This invited talk first delves into the division between the two primary branches of AI research: symbolic AI, which predominantly focuses on knowledge representation and logical reasoning, and sub-symbolic AI, primarily centered on machine learning employing neural networks. We explore both the notable accomplishments and the challenges encountered in each of these approaches. We provide instances where traditional deep learning encounters limitations, and we elucidate significant obstacles in achieving automated symbolic reasoning. We then discuss the recent groundbreaking advancements in generative AI, driven by language models such as ChatGPT. We showcase instances where these models excel and, conversely, where they exhibit shortcomings and produce erroneous information. We identify and illustrate five key reasons for potential failures in language models, which include: (i) information loss due to data compression, (ii) training bias, (iii) the incorporation of incorrect external data, (iv) the misordering of results, and (v) the failure to detect and resolve logical inconsistencies contained in a sequence of LLM-generated prompt-answers. Lastly, we touch upon the Chat2Data project, which endeavors to leverage language models for the automated verification and enhancement of relational databases, all while mitigating the pitfalls (i)-(v) mentioned earlier. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1603
-
Gou 2023
A lightweight biomedical named entity recognition with pre-trained model
2023 IEEE 3rd International Conference on Data Science and Computer Application, ICDSCA 2023 2023;():117-121 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICDSCA59871.2023.10392374 · Ref ID: 4976 Biomedical Named Entity Recognition (BioNER) is a specialized subfield of Named Entity Recognition (NER) that focuses on identifying and classifying named entities in biomedical and clinical texts. The goal of BioNER is to extract essential information, such as genes, proteins, diseases, and drugs, from scientific literature, electronic health records (EHRs), biomedical databases, and other biomedical text sources. The recognition and classification of these entities are crucial for various biomedical and healthcare-related tasks, including information retrieval, data integration, knowledge extraction, and drug discovery. Traditional BioNER methods typically involve rule-based approaches or machine learning algorithms; these methods were widely used before the advent of deep learning and transformer-based models. Bidirectional Encoder Representations from Transformers (BERT) is a groundbreaking transformer-based language model. It has revolutionized various natural language processing (NLP) tasks by capturing contextual information and obtaining optimal results on multiple benchmarks. This study proposes LWNER, a lightweight BioNER model optimized from traditional BERT, which captures contextual information and transfers knowledge from pre-training on large-scale text corpora without relying heavily on feature engineering and handcrafted rules. Fine-tuning BERT on biomedical-specific data helps adapt the model to the nuances and terminology of the biomedical domain. We evaluate LWNER on the BioCreative datasets BC2GM, BC4CHEMD, and BC5CDR; notably, chemical entities in BC5CDR achieve an F1-score of 91.3%. We construct an online web tool based on LWNER to identify entities in arbitrary text from the scientific literature for building knowledge graphs. © 2023 IEEE. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#532
-
Gouidis 2024
LLM-aided Knowledge Graph construction for Zero-Shot Visual Object State Classification
14th International Conference on Pattern Recognition Systems (ICPRS) 2024;(): London, ENGLAND Ieee 2024 DOI: 10.1109/icprs62101.2024.10677802 · Ref ID: 3132 The problem of classifying the states of objects using visual information holds great importance in both applied and theoretical contexts. This work focuses on the special case of Zero-Shot Object-Agnostic State Classification (ZS-OaSC). To tackle this problem, we introduce an innovative strategy that capitalizes on the capabilities of Graph Neural Networks to learn to project semantic embeddings into visual space and on the potential of Large Language Models (LLMs) to provide rich content for constructing Knowledge Graphs (KGs). Through a comprehensive ablation study, we explore the synergies between LLMs and KGs, uncovering critical insights about their integration in the context of the ZS-OaSC problem. Our proposed methodology is rigorously evaluated against current state-of-the-art (SoA) methods, demonstrating superior performance on various image datasets. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1436
-
Graham 2023
Investigating antiquities trafficking with generative pre-trained transformer (GPT)-3 enabled knowledge graphs: A case study
Background: There is a wide variety of potential sources from which insight into the antiquities trade could be culled, from newspaper articles to auction catalogues, to court dockets, to personal archives, if it could all be systematically examined. We explore the use of a large language model, GPT-3, to semi-automate the creation of a knowledge graph of a body of scholarship concerning the antiquities trade. Methods: We give GPT-3 a prompt guiding it to identify knowledge statements around the trade. Given GPT-3's understanding of the statistical properties of language, our prompt teaches GPT-3 to append text to each article we feed it, where the appended text summarizes the knowledge in the article. The summary is in the form of a list of subject, predicate, and object relationships, representing a knowledge graph. Previously we created such lists by manually annotating the source articles. We compare the result of this automatic process with a knowledge graph created from the same sources by hand. When such knowledge graphs are projected into a multi-dimensional embedding model using a neural network (via the Ampligraph open-source Python library), the relative positioning of entities implies the probability of a connection; the direction of the positioning implies the kind of connection. Thus, we can interrogate the embedding model to discover new probable relationships. The results can generate new insight about the antiquities trade, suggesting possible avenues of research. Results: We find that our semi-automatic approach to generating the knowledge graph in the first place produces comparable results to our hand-made version, but at an enormous savings of time and a possible expansion of the amount of materials we can consider. Conclusions: These results have implications for working with other kinds of archaeological knowledge in grey literature, reports, articles, and other venues via computational means. Copyright: © 2023 Graham S et al. |
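The subject-predicate-object lists this abstract describes can be turned into graph edges with a small parser. The 'subject | predicate | object' line format below is an assumed convention for illustration, not the paper's actual prompt output.

```python
def parse_triples(text):
    """Parse lines of the (assumed) form 'subject | predicate | object'
    into (subject, predicate, object) tuples, skipping malformed lines
    so one bad generation does not break the graph build."""
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(tuple(parts))
    return triples
```

Tolerating malformed lines matters in this setting because LLM output is only statistically, not structurally, guaranteed to follow the prompted format.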
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2650
-
Graupner 2009
Making processes from best practice frameworks actionable
2009 13th Enterprise Distributed Object Computing Conference Workshops 2009;():25-34 2009 DOI: 10.1109/EDOCW.2009.5332021 · Ref ID: 6860 Best-practice frameworks provide guidance for organizing work in business. They enable reuse of experience within a domain. However, best practice frameworks are general and usually cover broad domains. Their guidance thus is often offered at an abstract level rather than as details of actionable tasks and processes to accomplish work. This paper presents an approach to bridge the gap between the abstractions available in best practice framework and actions that have to be performed by people or systems in a repeatable manner. We identify knowledge from best practices frameworks, categorize it and represent it in the form of reusable, interpretable templates. Template interpretation guides the refinement process from general concepts of best practices frameworks into actionable concepts such as specific tasks to be performed by assigned roles. A prototype implemented to validate the approach is also described. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#604
-
Gromann 2020
Neural language models for the multilingual, transcultural, and multimodal Semantic Web
A vision of a truly multilingual Semantic Web has found strong support with the Linguistic Linked Open Data community. Standards, such as OntoLex-Lemon, highlight the importance of explicit linguistic modeling in relation to ontologies and knowledge graphs. Nevertheless, there is room for improvement in terms of automation, usability, and interoperability. Neural Language Models have achieved several breakthroughs and successes considerably beyond Natural Language Processing (NLP) tasks and recently also in terms of multimodal representations. Several paths naturally open up to port these successes to the Semantic Web, from automatically translating linguistic information associated with structured knowledge resources to multimodal question-answering with machine translation. Language is also an important vehicle for culture, an aspect that deserves considerably more attention. Building on existing approaches, this article envisions joint forces between Neural Language Models and Semantic Web technologies for multilingual, transcultural, and multimodal information access and presents open challenges and opportunities in this direction. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1980
-
Gromann 2019
Towards the detection and formal representation of semantic shifts in inflectional morphology
OpenAccess Series in Informatics 2019;70(): Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing 2019 DOI: 10.4230/OASIcs.LDK.2019.21 · Ref ID: 5727 Semantic shifts caused by derivational morphemes is a common subject of investigation in language modeling, while inflectional morphemes are frequently portrayed as semantically more stable. This study is motivated by the previously established observation that inflectional morphemes can be just as variable as derivational ones. For instance, the English plural “-s” can turn the fabric silk into the garments of a jockey, silks. While humans know that silk in this sense has no plural, it takes more for machines to arrive at this conclusion. Frequently utilized computational language resources, such as WordNet, or models for representing computational lexicons, like OntoLex-Lemon, have no descriptive mechanism to represent such inflectional semantic shifts. To investigate this phenomenon, we extract word pairs of different grammatical number from WordNet that feature additional senses in the plural and evaluate their distribution in vector space, i.e., pre-trained word2vec and fastText embeddings. We then propose an extension of OntoLex-Lemon to accommodate this phenomenon that we call inflectional morpho-semantic variation to provide a formal representation accessible to algorithms, neural networks, and agents. While the exact scope of the problem is yet to be determined, this first dataset shows that it is not negligible. © Dagmar Gromann and Thierry Declerck. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1842
-
Gu 2024
RRE: A Relevance Relation Extraction Framework for Cross-domain Recommender System at Alipay
Proceedings - IEEE International Conference on Multimedia and Expo 2024;(): IEEE Computer Society 2024 DOI: 10.1109/ICME57554.2024.10687762 · Ref ID: 4174 Prevailing embedding-based cross-domain recommendation (CDR) techniques produce embeddings individually or transfer the overall feature distribution from one domain to another. However, in real-world applications, they might be ineffective due to semantic gap across domains, which arises from divergent purposes and descriptive styles. In this work, we aim to address this challenge between Mini Program and content channel in Alipay, the largest mobile payment platform in China. To bridge utility-oriented Mini Programs and advertisement-oriented contents, we utilize side information of entities to make the entity relevance scores trustworthy. Then we introduce a knowledge graph-based model to reduce the impact of embedding vibrating from contrastive learning and the biases from the pretrained language models. Extensive experiments conducted on a large-scale Alipay offline dataset as well as an online environment demonstrated the effectiveness of our proposed framework. © 2024 IEEE. |
Davis
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3340
-
Gu 2023
Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events
arXiv 2023;(): 2023 Ref ID: 7777 Large language models (LLMs), such as GPT-4, have demonstrated remarkable capabilities across a wide range of tasks, including health applications. In this paper, we study how LLMs can be used to scale biomedical knowledge curation. We find that while LLMs already possess decent competency in structuring biomedical text, distillation into a task-specific student model through self-supervised learning can attain substantial gains over out-of-box LLMs, with additional advantages such as cost, efficiency, and white-box model access. We conduct a case study on adverse drug event (ADE) extraction, which is an important area for improving care. On standard ADE extraction evaluation, a GPT-3.5-distilled PubMedBERT model attained accuracy comparable to supervised state-of-the-art models without using any labeled data. Despite being over 1,000 times smaller, the distilled model outperformed its teacher GPT-3.5 by over 6 absolute points in F1 and GPT-4 by over 5 absolute points. Ablation studies on distillation model choice (e.g., PubMedBERT vs BioGPT) and ADE extraction architecture shed light on best practices for biomedical knowledge extraction. Similar gains were attained by distillation for other standard biomedical knowledge extraction tasks such as gene-disease associations and protected health information, further illustrating the promise of this approach. |
Srividya voted · Xinchen voted
#3867
-
Gu 2023
Sem4SAP: Synonymous Expression Mining From Open Knowledge Graph For Language Model Synonym-Aware Pretraining
arXiv 2023;(): 2023 Ref ID: 7664 A model's ability to understand synonymous expressions is crucial in many downstream tasks: it helps the model better judge similarity between contexts and makes it more robust to synonym-substitution attacks. However, many pretrained language models (PLMs) lack synonym knowledge due to the limitations of small-scale synsets and PLMs' pretraining objectives. In this paper, we propose a framework called Sem4SAP that mines synsets from an Open Knowledge Graph (Open-KG) and uses the mined synsets for synonym-aware pretraining of language models. We coarsely filter the content in the Open-KG and use frequency information to aid the clustering process under low-resource, unsupervised conditions. We expand the mined synsets by migrating core semantics between synonymous expressions. We also propose two novel and effective synonym-aware pretraining methods for injecting synonym knowledge into PLMs. Extensive experiments demonstrate that Sem4SAP dramatically outperforms the original PLMs and other baselines on ten different tasks.
Srividya voted · Xinchen voted
#3393
-
Gunaratna 2021
Entity Context Graph: Learning Entity Representations from Semi-Structured Textual Sources on the Web
arXiv 2021;(): 2021 Ref ID: 7446 Knowledge is captured in the form of entities and their relationships and stored in knowledge graphs. Knowledge graphs enhance the capabilities of applications in many different areas including Web search, recommendation, and natural language understanding. This is mainly because, entities enable machines to understand things that go beyond simple tokens. Many modern algorithms use learned entity embeddings from these structured representations. However, building a knowledge graph takes time and effort, hence very costly and nontrivial. On the other hand, many Web sources describe entities in some structured format and therefore, finding ways to get them into useful entity knowledge is advantageous. We propose an approach that processes entity centric textual knowledge sources to learn entity embeddings and in turn avoids the need for a traditional knowledge graph. We first extract triples into the new representation format that does not use traditional complex triple extraction methods defined by pre-determined relationship labels. Then we learn entity embeddings through this new type of triples. We show that the embeddings learned from our approach are: (i) high quality and comparable to a known knowledge graph-based embeddings and can be used to improve them further, (ii) better than a contextual language model-based entity embeddings, and (iii) easy to compute and versatile in domain-specific applications where a knowledge graph is not readily available |
yuexi voted · Mike voted
#562
-
Guo 2024
Memory-Enhanced Knowledge Reasoning with Reinforcement Learning
In recent years, the emergence of large-scale language models, such as ChatGPT, has presented significant challenges to research on knowledge graphs and knowledge-based reasoning. As a result, the direction of research on knowledge reasoning has shifted. Two critical issues in knowledge reasoning research are the algorithm of the model itself and the selection of paths. Most studies utilize LSTM as the path encoder and memory module. However, when processing long sequence data, LSTM models may encounter the problem of long-term dependencies, where memory units of the model may decay gradually with an increase in time steps, leading to forgetting earlier input information. This can result in a decline in the performance of the LSTM model in long sequence data. Additionally, as the data volume and network depth increase, there is a risk of gradient disappearance. This study improved and optimized the LSTM model to effectively address the problems of gradient explosion and gradient disappearance. An attention layer was employed to alleviate the issue of long-term dependencies, and ConvR embedding was used to guide path selection and action pruning in the reinforcement learning inference model. The overall model achieved excellent reasoning results. |
mohammed afaan voted · yuexi voted
#1216
-
Guo 2022
Dynamic Knowledge Integration for Natural Language Inference
Proceedings - 2022 4th International Conference on Natural Language Processing, ICNLP 2022 2022;():360-364 Institute of Electrical and Electronics Engineers Inc. 2022 DOI: 10.1109/ICNLP55136.2022.00066 · Ref ID: 5483 Natural language inference (NLI) aims to determine the entailment relationship between a premise and a hypothesis. It is a fundamental but difficult problem, since there may exist a serious semantic and logical gap between the premise and the hypothesis. Despite using strong pre-trained language models (PLMs), previous work performs poorly on complicated reasoning for knowledge-sensitive cases, as it ignores the integration of external knowledge. We propose a dynamic knowledge integration strategy for NLI, in which knowledge from multiple knowledge graphs (KGs) can be dynamically integrated. For each KG, it transforms input tokens into a graph according to the connectivity of the related entities. All the graphs are encoded by a group of parallel graph neural networks (GNNs), and after each layer the intermediate results are integrated dynamically, conditioned on the input text. This strategy also facilitates the incorporation of a PLM, simply by treating the input tokens as a fully connected graph and adapting the PLM outputs as the node embeddings. Experiments on SNLI, MNLI and SciTail show that the dynamic integration of knowledge from WordNet and ConceptNet achieves significant improvements over the strongest baseline built upon RoBERTa. © 2022 IEEE.
Ishan voted · Srividya voted
#556
-
Guo 2022
A medical question answering system using large language models and knowledge graphs
Question answering systems have become prominent in all areas, while in the medical domain it has been challenging because of the abundant domain knowledge. Retrieval based approach has become promising as large pretrained language models come forth. This study focuses on building a retrieval-based medical question answering system, tackling the challenge with large language models and knowledge extensions via graphs. We first retrieve an extensive but coarse set of answers via Elasticsearch efficiently. Then, we utilize semantic matching with pretrained language models to achieve a fine-grained ranking enhanced with named entity recognition and knowledge graphs to exploit the relation of the entities in question and answer. A new architecture based on siamese structures for answer selection is proposed. To evaluate the approach, we train and test the model on two Chinese data sets, NLPCC2017 and cMedQA. We also conduct experiments on two English data sets, TREC-QA and WikiQA. Our model achieves consistent improvement as compared to strong baselines on all data sets. Qualification studies with cMedQA and our in-house data set show that our system gains highly competitive performance. The proposed medical question answering system outperforms baseline models and systems in quantification and qualification evaluations. |
Davis voted · mohammed afaan voted
#1249
-
Guo 2024
Enhancing Commonsense Reasoning through Entity Type Knowledge Graph Completion
2024 5th International Conference on Artificial Intelligence and Electromechanical Automation, AIEA 2024 2024;():298-302 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/AIEA62095.2024.10692845 · Ref ID: 4122 Entity type knowledge graphs play a crucial role in enhancing the commonsense reasoning capabilities of pre-trained models, but their incomplete information often hinders the performance of these pre-trained language models. We propose a novel method that enhances commonsense reasoning by supplementing missing entity types in knowledge graphs through the aggregation of single-hop and multi-hop neighbor information. Our approach consists of three main components: aggregating neighbor information, inferring missing entity types using both local and global reasoning, and predicting the final entity types based on a combined scoring mechanism. We demonstrate the effectiveness of our method on two widely recognized datasets, CommonsenseQA and OpenBookQA. Notably, on the OpenBookQA dataset, enhancing the BART pre-trained model with the completed entity type knowledge graph improved its accuracy from 82.8% to 87.4% compared to using the original, incomplete knowledge graph. Experimental results indicate that enriching entity type information significantly enhances the ability of pretrained models to leverage implicit commonsense knowledge, particularly in tasks requiring a deep understanding of entity relationships. © 2024 IEEE. |
Srividya voted · Xinchen voted
#419
-
Gupta 2021
Knowledge Based Deep Inception Model for Web Page Classification
Web page classification is decisive for information retrieval and management tasks and plays an imperative role in natural language processing (NLP) problems in web engineering. Traditional machine learning algorithms extract desired features from web pages, whereas deep learning algorithms learn features as the network goes deeper. Pre-trained models such as BERT attain remarkable results for text classification and continue to show state-of-the-art performance. Knowledge graphs can provide rich, structured factual information for better language modelling and representation. In this study, we propose an ensemble Knowledge Based Deep Inception (KBDI) approach for web page classification that learns bidirectional contextual representations using pre-trained BERT incorporating knowledge graph embeddings, and fine-tunes the target task by applying a Deep Inception network utilizing parallel multi-scale semantics. The proposed ensemble evaluates the efficacy of fusing domain-specific knowledge embeddings with the pre-trained BERT model. Experimental results show that the proposed BERT-fused KBDI model outperforms benchmark baselines and achieves better performance than other conventional approaches evaluated on web page classification datasets.
mohammed afaan voted · yuexi voted
#969
-
Gurgurov 2024
Adapting Multilingual LLMs to Low-Resource Languages with Knowledge Graphs via Adapters
KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():63-74 Association for Computational Linguistics (ACL) 2024 Ref ID: 4346 This paper explores the integration of graph knowledge from linguistic ontologies into multilingual Large Language Models (LLMs) using adapters to improve performance for low-resource languages (LRLs) in sentiment analysis (SA) and named entity recognition (NER). Building upon successful parameter-efficient fine-tuning techniques, such as K-ADAPTER (Wang et al., 2021) and MAD-X (Pfeiffer et al., 2020), we propose a similar approach for incorporating knowledge from multilingual graphs, connecting concepts in various languages with each other through linguistic relationships, into multilingual LLMs for LRLs. Specifically, we focus on eight LRLs —Maltese, Bulgarian, Indonesian, Nepali, Javanese, Uyghur, Tibetan, and Sinhala — and employ language-specific adapters fine-tuned on data extracted from the language-specific section of ConceptNet, aiming to enable knowledge transfer across the languages covered by the knowledge graph. We compare various fine-tuning objectives, including standard Masked Language Modeling (MLM), MLM with full-word masking, and MLM with targeted masking, to analyze their effectiveness in learning and integrating the extracted graph data. Through empirical evaluation on language-specific tasks, we assess how structured graph knowledge affects the performance of multilingual LLMs for LRLs in SA and NER, providing insights into the potential benefits of adapting language models for low-resource scenarios. ©2024 Association for Computational Linguistics. |
Srividya voted · Ishan voted
#3497
-
Gutiérrez 2024
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
arXiv 2024;(): 2024 Ref ID: 8313 In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting. Despite the impressive accomplishments, large language models (LLMs), even with retrieval-augmented generation (RAG), still struggle to efficiently and effectively integrate a large amount of new experiences after pre-training. In this work, we introduce HippoRAG, a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory to enable deeper and more efficient knowledge integration over new experiences. HippoRAG synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of neocortex and hippocampus in human memory. We compare HippoRAG with existing RAG methods on multi-hop question answering and show that our method outperforms the state-of-the-art methods remarkably, by up to 20%. Single-step retrieval with HippoRAG achieves comparable or better performance than iterative retrieval like IRCoT while being 10-30 times cheaper and 6-13 times faster, and integrating HippoRAG into IRCoT brings further substantial gains. Finally, we show that our method can tackle new types of scenarios that are out of reach of existing methods. Code and data are available at https://github.com/OSU-NLP-Group/HippoRAG. |
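The Personalized PageRank step that HippoRAG builds on can be illustrated with a toy graph. The entities, edges, and parameters below are invented for illustration and are not from the paper; this is a minimal pure-Python sketch of the algorithm, not HippoRAG's implementation:

```python
# Toy knowledge graph as an undirected adjacency list; entities and
# edges are invented and do not come from the paper.
graph = {
    "aspirin": ["headache", "blood thinning"],
    "ibuprofen": ["headache"],
    "headache": ["aspirin", "ibuprofen", "migraine"],
    "blood thinning": ["aspirin"],
    "migraine": ["headache"],
}

def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Power iteration for Personalized PageRank.

    With probability alpha the walker follows a random edge; with
    probability 1 - alpha it teleports back to the seed distribution,
    biasing relevance toward the query entities.
    """
    nodes = list(graph)
    total = sum(seeds.values())
    teleport = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * teleport[n] for n in nodes}
        for n in nodes:
            share = alpha * rank[n] / len(graph[n])
            for m in graph[n]:
                nxt[m] += share
        rank = nxt
    return rank

# Seed the walk from the entities recognized in a query mentioning "aspirin".
scores = personalized_pagerank(graph, seeds={"aspirin": 1.0})
top = max(scores, key=scores.get)
```

Retrieval would then favor passages attached to the highest-scoring entities; HippoRAG's actual pipeline layers an LLM-built knowledge graph and a passage index on top of this step.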
Xinchen voted · mohammed afaan voted
#1859
-
Hahn 2019
Self-knowledge distillation in natural language processing
International Conference Recent Advances in Natural Language Processing, RANLP 2019;2019-September():423-430 Incoma Ltd 2019 DOI: 10.26615/978-954-452-056-4_050 · Ref ID: 5773 Since deep learning became a key player in natural language processing (NLP), many deep learning models have been showing remarkable performances in a variety of NLP tasks, and in some cases, they are even outperforming humans. Such high performance can be explained by efficient knowledge representation of deep learning models. While many methods have been proposed to learn more efficient representation, knowledge distillation from pretrained deep networks suggest that we can use more information from the soft target probability to train other neural networks. In this paper, we propose a new knowledge distillation method self-knowledge distillation, based on the soft target probabilities of the training model itself, where multimode information is distilled from the word embedding space right below the softmax layer. Due to the time complexity, our method approximates the soft target probabilities. In experiments, we applied the proposed method to two different and fundamental NLP tasks: language model and neural machine translation. The experiment results show that our proposed method improves performance on the tasks. © 2019 Association for Computational Linguistics (ACL). All rights reserved. |
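The soft-target idea this abstract builds on can be sketched as a KL divergence between temperature-softened teacher and student distributions. The logits and temperature below are invented, and the paper's approximation of the soft targets over the word-embedding space is omitted; this is only the standard soft-target term:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened probabilities; a higher temperature exposes
    more of the non-argmax ('dark') knowledge in the distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between softened distributions, the usual
    soft-target objective in knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# In self-distillation the 'teacher' logits come from the training model
# itself rather than from a separate pretrained network.
loss = distillation_loss([4.0, 1.0, 0.2], [3.5, 1.2, 0.1])
```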
Ishan voted · brandon voted
#1268
-
Han 2023
Enhancing the Effect of BERT Model in the Medical Field Based on the Knowledge Graph
Proceedings of SPIE - The International Society for Optical Engineering 2023;12724(): SPIE 2023 DOI: 10.1117/12.2687418 · Ref ID: 5290 A knowledge graph is a form of knowledge representation that captures information about entities, entity attributes and relationships between entities in a structured way, and is widely used in intelligent retrieval, recommendation systems, intelligent question answering, etc. A knowledge graph gives a graphical representation of the relationships between different concepts and topics in a specific field, while BERT is a state-of-the-art language model that can understand the context and meaning of words in text. Combining a medical knowledge graph (MKG) with the bidirectional encoder representations of the BERT model shows promise for improving medical information retrieval and decision-making: it can create a more comprehensive and accurate representation of medical knowledge, which can be used to guide clinical decision-making, improve patient prognosis, and ultimately enhance BERT's effectiveness in the medical field. © 2023 SPIE.
Ishan voted · Srividya voted
#2881
-
Hao 2023
Semantic Comprehension Method for Chinese Sentences Based on Minimal Semantic Structures and Its Application
The importance of small sentences (clauses) in Chinese is no less than that of full sentences, which is an inherent feature of Chinese itself. Based on this characteristic, this paper proposes a sentence-semantics understanding method for Chinese scientific and technological abstracts based on the minimum semantic structure. Firstly, a conceptual model was established for identifying the minimum semantic structure of a sentence, based on a corpus of verbs, relative words, prepositions and markers and on the Language Technology Platform (LTP) tools. Secondly, the model was used to extract the minimum semantic structure of abstract sentences. Finally, three experiments were carried out: classification of abstract sentences, knowledge graph generation, and automatic semantic inference discovery. Our study confirmed the practical value of the small Chinese sentence. The experimental results show that using small sentences to understand the semantics of Chinese text works better than using full-stop sentences, and that the minimum semantic structure can serve as the basic unit of Chinese sentence semantic comprehension. This method is conducive to the automatic understanding of the basic semantics of sentences in unstructured Chinese science and technology texts.
Mike voted · mohammed afaan voted
#3070
-
Haque 2024
Utilizing Structural Metrics from Knowledge Graphs to Enhance the Robustness Quantification of Large Language Models (Extended Abstract)
2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA) 2024;():1-2 2024 DOI: 10.1109/DSAA61799.2024.10722791 · Ref ID: 6087 The goal of this study is to determine whether large language models (LLMs) like CodeLlama, Mistral, and Vicuna can be used to build knowledge graphs (KGs) from textual data. We create class descriptions for well-known KGs such as DBpedia, YAGO, and Google Knowledge Graph, from which we extract RDF triples and enhance these graphs using different preprocessing methods. Six structural quality measures are used in the study to compare the constructed and existing KGs. Our results demonstrate how important LLMs are to improving KG construction and provide insightful information for KG construction researchers. Moreover, an in-depth analysis of popular open-source LLM models enables researchers to identify the most efficient model for various tasks, ensuring optimal performance in specific applications. |
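The abstract does not name its six structural quality measures, so the sketch below computes two generic ones (directed density and average out-degree) over invented RDF-style triples, just to illustrate the kind of graph-level comparison involved:

```python
from collections import defaultdict

# Invented RDF-style triples standing in for an extracted KG; not from
# DBpedia, YAGO, or the paper's data.
triples = [
    ("Berlin", "capitalOf", "Germany"),
    ("Germany", "memberOf", "EU"),
    ("Paris", "capitalOf", "France"),
    ("France", "memberOf", "EU"),
]

nodes = {s for s, _, o in triples} | {o for s, _, o in triples}
out_degree = defaultdict(int)
for s, _, o in triples:
    out_degree[s] += 1

n = len(nodes)
density = len(triples) / (n * (n - 1))  # directed-graph density
avg_out_degree = len(triples) / n       # mean edges emitted per node
```

Comparing such measures between a constructed KG and a reference KG is one way to quantify how faithfully an LLM reproduces graph structure.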
Xinchen voted · Srividya voted
#695
-
Harper 2022
Question Answering with Additive Restrictive Training (QuAART): Question Answering for the Rapid Development of New Knowledge Extraction Pipelines
23rd International Conference on Knowledge Engineering and Knowledge Management (EKAW) 2022;13514():51-65 Bozen Bolzano, ITALY Springer International Publishing Ag 2022 DOI: 10.1007/978-3-031-17105-5_4 · Ref ID: 3786 Numerous studies have explored the use of language models and question answering techniques for knowledge extraction. In most cases, these models are trained on data specific to the new task at hand. We hypothesize that using models trained only on generic question answering data (e.g. SQuAD) is a good starting point for domain-specific entity extraction. We test this hypothesis, and explore whether the addition of small amounts of training data can help lift model performance. We pay special attention to the use of null answers and unanswerable questions to optimize performance. To our knowledge, no studies have been done to evaluate the effectiveness of this technique. We do so for an end-to-end entity mention detection and entity typing task on HAnDS and FIGER, two common evaluation datasets for fine-grained entity recognition. We focus on fine-grained entity recognition because it is a challenging scenario, and because the long tail of types in this task highlights the need for entity extraction systems that can deal with new domains and types. To our knowledge, we are the first system beyond those presented in the original FIGER and HAnDS papers to tackle the task in an end-to-end fashion. Using an extremely small sample from the distantly-supervised HAnDS training data - 0.0015%, or less than 500 passages randomly chosen out of 31 million - we produce a CoNLL F1 score of 73.72 for entity detection on FIGER. Our end-to-end detection and typing evaluation produces macro and micro F1s of 45.11 and 54.75, based on the FIGER evaluation metrics. This work provides a foundation for the rapid development of new knowledge extraction pipelines.
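The CoNLL-style entity-detection score mentioned above is an exact-match F1 over predicted mention spans. A minimal sketch, with invented spans rather than FIGER data:

```python
def span_f1(gold_spans, pred_spans):
    """Exact-match precision, recall, and F1 over mention spans,
    the scheme behind CoNLL-style entity-detection scores."""
    gold, pred = set(gold_spans), set(pred_spans)
    tp = len(gold & pred)  # spans predicted exactly right
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# One of two gold mentions found, plus one spurious prediction.
p, r, f = span_f1({(0, 2), (5, 7)}, {(0, 2), (8, 9)})
```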
Mike voted · mohammed afaan voted
#39
-
Harrer 2023
Attention is not all you need: the complicated case of ethically using large language models in healthcare and medicine
Large Language Models (LLMs) are a key component of generative artificial intelligence (AI) applications for creating new content including text, imagery, audio, code, and videos in response to textual instructions. Without human oversight, guidance and responsible design and operation, such generative AI applications will remain a party trick with substantial potential for creating and spreading misinformation or harmful and inaccurate content at unprecedented scale. However, if positioned and developed responsibly as companions to humans augmenting but not replacing their role in decision making, knowledge retrieval and other cognitive processes, they could evolve into highly efficient, trustworthy, assistive tools for information management. This perspective describes how such tools could transform data management workflows in healthcare and medicine, explains how the underlying technology works, provides an assessment of risks and limitations, and proposes an ethical, technical, and cultural framework for responsible design, development, and deployment. It seeks to incentivise users, developers, providers, and regulators of generative AI that utilises LLMs to collectively prepare for the transformational role this technology could play in evidence-based sectors. |
yuexi voted · mohammed afaan voted
#61
-
He 2020
BERT-MK: Integrating Graph Contextualized Knowledge into Pre-trained Language Models
Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2020;():2281-2290 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3159 Complex node interactions are common in knowledge graphs (KGs), and these interactions can be considered as contextualized knowledge exists in the topological structure of KGs. Traditional knowledge representation learning (KRL) methods usually treat a single triple as a training unit, neglecting the usage of graph contextualized knowledge. To utilize these unexploited graph-level knowledge, we propose an approach to model subgraphs in a medical KG. Then, the learned knowledge is integrated with a pre-trained language model to do the knowledge generalization. Experimental results demonstrate that our model achieves the state-of-the-art performance on several medical NLP tasks, and the improvement above MedERNIE indicates that graph contextualized knowledge is beneficial. |
Srividya voted · Ishan voted
#1650
-
He 2024
MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models
Proceedings - IEEE International Conference on Multimedia and Expo 2024;(): IEEE Computer Society 2024 DOI: 10.1109/ICME57554.2024.10687798 · Ref ID: 4147 Knowledge Graph Completion (KGC) aims to conduct reasoning on the facts within knowledge graphs and automatically infer missing links. Existing methods can mainly be categorized into structure-based or description-based. Structure-based methods effectively represent relational facts in knowledge graphs using entity embeddings and description-based methods leverage pre-trained language models (PLMs) to understand textual information. In this paper, we propose Momentum Contrast for knowledge graph completion with Structure-Augmented pre-trained language models (MoCoSA), which allows the PLM to perceive the structural information by the adaptable structure encoder. We proposed momentum hard negative and intra-relation negative sampling to improve learning efficiency. Experimental results demonstrate that our approach achieves state-of-the-art performance in terms of mean reciprocal rank (MRR), with improvements of 2.5% on WN18RR and 21% on OpenBG500. © 2024 IEEE. |
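MoCoSA reports mean reciprocal rank (MRR), which averages the reciprocal rank of the gold entity across queries. As a refresher, with toy rankings that are not from the paper:

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """MRR over queries: average of 1/rank of the gold entity.

    ranked_lists[i] is the model's ranking for query i and gold[i] is
    the correct entity. Rank is 1-based; an absent gold contributes 0.
    """
    total = 0.0
    for ranking, answer in zip(ranked_lists, gold):
        if answer in ranking:
            total += 1.0 / (ranking.index(answer) + 1)
    return total / len(gold)

mrr = mean_reciprocal_rank(
    [["Tokyo", "Kyoto"], ["Oslo", "Bergen", "Trondheim"]],
    ["Kyoto", "Trondheim"],
)  # (1/2 + 1/3) / 2
```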
Srividya voted · Xinchen voted
#1550
-
He 2023
LAL-JER: Label-Aware Learning for Adaptive Joint Entity and Relation Extraction with LLM data augmentation
ACM International Conference Proceeding Series 2023;():414-419 Association for Computing Machinery 2023 DOI: 10.1145/3640912.3640993 · Ref ID: 4767 Joint entity and relation extraction has achieved great improvements in natural language processing (NLP) and has been widely applied, for example in knowledge graph construction, query understanding and question answering. Existing methods usually spend a long time fitting models on particular datasets with given label types, which greatly limits their ability to generalize: the model cannot make predictions for label types not seen in the training set. To address this issue, we propose using prompts to incorporate the semantic meaning of the label-type description. Furthermore, we use a large language model to perform data augmentation to improve the robustness of our model during training. Extensive experiments and an ablation study on two joint entity and relation extraction benchmarks validate the effectiveness of our work: 1. our method achieves state-of-the-art performance on joint entity and relation extraction benchmarks based on the pre-trained language model BERT; 2. given prompts, our method can help the model make predictions on label types unseen before. © 2023 ACM.
Kwesi voted · mohammed afaan voted
#3743
-
He 2020
On the Role of Conceptualization in Commonsense Knowledge Graph Construction
arXiv 2020;(): 2020 Ref ID: 7390 Commonsense knowledge graphs (CKGs) like Atomic and ASER are substantially different from conventional KGs as they consist of much larger number of nodes formed by loosely-structured text, which, though, enables them to handle highly diverse queries in natural language related to commonsense, leads to unique challenges for automatic KG construction methods. Besides identifying relations absent from the KG between nodes, such methods are also expected to explore absent nodes represented by text, in which different real-world things, or entities, may appear. To deal with the innumerable entities involved with commonsense in the real world, we introduce to CKG construction methods conceptualization, i.e., to view entities mentioned in text as instances of specific concepts or vice versa. We build synthetic triples by conceptualization, and further formulate the task as triple classification, handled by a discriminatory model with knowledge transferred from pretrained language models and fine-tuned by negative sampling. Experiments demonstrate that our methods can effectively identify plausible triples and expand the KG by triples of both new nodes and edges of high diversity and novelty. |
Xinchen voted · Srividya voted
#486
-
He 2023
KRP-DS: A Knowledge Graph-Based Dialogue System with Inference-Aided Prediction
With the popularity of ChatGPT, there has been increasing attention towards dialogue systems. Researchers are dedicated to designing a knowledgeable model that can engage in conversations like humans. Traditional seq2seq dialogue models often suffer from limited performance and the issue of generating safe responses. In recent years, large-scale pretrained language models have demonstrated their powerful capabilities across various domains. Many studies have leveraged these pretrained models for dialogue tasks to address concerns such as safe response generation. Pretrained models can enhance responses by carrying certain knowledge information after being pre-trained on large-scale data. However, when specific knowledge is required in a particular domain, the model may still generate bland or inappropriate responses, and the interpretability of such models is poor. Therefore, in this paper, we propose the KRP-DS model. We design a knowledge module that incorporates a knowledge graph as external knowledge in the dialogue system. The module utilizes contextual information for path reasoning and guides knowledge prediction. Finally, the predicted knowledge is used to enhance response generation. Experimental results show that our proposed model can effectively improve the quality and diversity of responses while having better interpretability, and outperforms baseline models in both automatic and human evaluations. |
mohammed afaan voted · yuexi voted
#835
-
He 2021
Towards Solving the Winograd Schema Challenge: Model-Free, Model-Based and a Spectrum in Between
14th International Conference on Knowledge Science, Engineering, and Management (KSEM) 2021;12816():126-138 Tokyo, JAPAN Springer International Publishing Ag 2021 DOI: 10.1007/978-3-030-82147-0_11 · Ref ID: 3689 The Winograd Schema Challenge (WSC) has attracted much attention recently as common sense is recognized to be not only the key to human-level intelligence but also a bottleneck faced by recent progress. Although neural language models (LMs) have achieved state-of-the-art (SOTA) performance on WSC, they fall short on interpretability and robustness against adversarial attacks. Contrarily, methods with structured representation and explicit reasoning suffer from the difficulty of knowledge acquisition and the rigidness of representation. In this paper, we look back on the current model-free and model-based approaches, pointing out the missing ingredients towards solving the WSC. We report our preliminary exploration of formalizing the WSC problems using a variant of first-order language and our first-hand findings of indispensable capabilities of human-level commonsense reasoning. The issues we encounter suggest that a full spectrum of representation tools and reasoning abilities are called for. |
Kwesi voted; brandon voted
#3966
-
Heim 2024
Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets
arXiv 2024;(): 2024 Ref ID: 8582 Current publicly available knowledge work data collections lack diversity, extensive annotations, and contextual information about the users and their documents. These issues hinder objective and comparable data-driven evaluations and optimizations of knowledge work assistance systems. Due to the considerable resources needed to collect such data in real-life settings and the necessity of data censorship, collecting such a dataset appears nearly impossible. For this reason, we propose a configurable, multi-agent knowledge work dataset generator. This system simulates collaborative knowledge work among agents producing Large Language Model-generated documents and accompanying data traces. Additionally, the generator captures all background information, given in its configuration or created during the simulation process, in a knowledge graph. Finally, the resulting dataset can be utilized and shared without privacy or confidentiality concerns. This paper introduces our approach's design and vision and focuses on generating authentic knowledge work documents using Large Language Models. Our study involving human raters who assessed 53% of the generated and 74% of the real documents as realistic demonstrates the potential of our approach. Furthermore, we analyze the authenticity criteria mentioned in the participants' comments and elaborate on potential improvements for identified common issues. |
Davis voted; mohammed afaan voted
#2604
-
Helali 2024
KGLiDS: A Platform for Semantic Abstraction, Linking, and Automation of Data Science
2024 IEEE 40th International Conference on Data Engineering (ICDE) 2024;():179-192 2024 DOI: 10.1109/ICDE60146.2024.00021 · Ref ID: 6233 In recent years, we have witnessed the growing interest from academia and industry in applying data science technologies to analyze large amounts of data. In this process, a myriad of artifacts (datasets, pipeline scripts, etc.) are created. However, there has been no systematic attempt to holistically collect and exploit all the knowledge and experiences that are implicitly contained in those artifacts. Instead, data scientists recover information and expertise from colleagues or learn via trial and error. Hence, this paper presents a scalable platform, KGLiDS, that employs machine learning and knowledge graph technologies to abstract and capture the semantics of data science artifacts and their connections. Based on this information, KGLiDS enables various downstream applications, such as data discovery and pipeline automation. Our comprehensive evaluation covers use cases in data discovery, data cleaning, transformation, and AutoML. It shows that KGLiDS is significantly faster with a lower memory footprint than the state-of-the-art systems while achieving comparable or better accuracy. |
mohammed afaan voted; Ishan voted
#2898
-
Henson 2012
Semantic Perception: Converting Sensory Observations to Abstractions
An abstraction is a representation of an environment derived from sensor observation data. Generating an abstraction requires inferring explanations from an incomplete set of observations (often from the Web) and updating these explanations on the basis of new information. This process must be fast and efficient. The authors' approach overcomes these challenges to systematically derive abstractions from observations. The approach models perception through the integration of an abductive logic framework called Parsimonious Covering Theory with Semantic Web technologies. The authors demonstrate this approach's utility and scalability through use cases in the healthcare and weather domains. |
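Parsimonious Covering Theory, which the abstract builds on, frames perception as abduction: find the smallest set of explanations whose covered observations jointly account for everything observed. A brute-force sketch under invented data (the explanation table is hypothetical, not from the paper, and real PCT uses far more efficient algorithms):

```python
from itertools import combinations

# Hypothetical explanation -> observations-it-covers table.
COVERS = {
    "flu":     {"fever", "cough"},
    "allergy": {"sneezing"},
    "cold":    {"cough", "sneezing"},
}

def parsimonious_cover(observations):
    """Return the smallest set of explanations that covers every
    observation (abductive inference); None if no cover exists."""
    hypos = list(COVERS)
    for size in range(1, len(hypos) + 1):
        for combo in combinations(hypos, size):
            covered = set().union(*(COVERS[h] for h in combo))
            if observations <= covered:
                return set(combo)
    return None

explanation = parsimonious_cover({"fever", "cough", "sneezing"})
```

Because candidate sets are enumerated smallest-first, the first cover found is guaranteed minimal in size, which is the parsimony criterion.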
mohammed afaan voted; yuexi voted
#3342
-
Heo 2024
Do LLMs "know" internally when they follow instructions?
arXiv 2024;(): 2024 Ref ID: 8729 Instruction-following is crucial for building AI agents with large language models (LLMs), as these models must adhere strictly to user-provided constraints and guidelines. However, LLMs often fail to follow even simple and clear instructions. To improve instruction-following behavior and prevent undesirable outputs, a deeper understanding of how LLMs' internal states relate to these outcomes is required. Our analysis of LLM internal states reveals a dimension in the input embedding space linked to successful instruction-following. We demonstrate that modifying representations along this dimension improves instruction-following success rates compared to random changes, without compromising response quality. Further investigation reveals that this dimension is more closely related to the phrasing of prompts than to the inherent difficulty of the task or instructions. This discovery also suggests explanations for why LLMs sometimes fail to follow clear instructions and why prompt engineering is often effective, even when the content remains largely unchanged. This work provides insight into the internal workings of LLMs' instruction-following, paving the way for reliable LLM agents. |
yuexi voted; Srividya voted
#1627
-
Hertling 2021
Matching with Transformers in MELT
CEUR Workshop Proceedings 2021;3063():13-24 CEUR-WS 2021 Ref ID: 5649 One of the strongest signals for automated matching of ontologies and knowledge graphs are the textual descriptions of the concepts. The methods that are typically applied (such as character- or token-based comparisons) are relatively simple, and therefore do not capture the actual meaning of the texts. With the rise of transformer-based language models, text comparison based on meaning (rather than lexical features) is possible. In this paper, we model the ontology matching task as a classification problem and present approaches based on transformer models. We further provide an easy-to-use implementation in the MELT framework which is suited for ontology and knowledge graph matching. We show that a transformer-based filter helps to choose the correct correspondences given a high-recall alignment and already achieves a good result with simple alignment post-processing methods. Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
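The filtering step the abstract describes, keeping only candidate correspondences that the classifier scores highly, reduces to thresholding confidences over a high-recall alignment. A schematic sketch (the concept pairs, scores, and threshold are made up; in MELT the confidence would come from a fine-tuned transformer classifier):

```python
def filter_alignment(candidates, threshold=0.5):
    """Keep correspondences whose classifier confidence exceeds the
    threshold; each candidate is (source_concept, target_concept, score)."""
    return [(s, t, p) for s, t, p in candidates if p > threshold]

# High-recall candidate alignment with hypothetical confidences.
candidates = [
    ("Person", "Human", 0.92),
    ("Person", "Place", 0.11),
    ("City",   "Town",  0.67),
]
kept = filter_alignment(candidates)
```

The point of the two-step design is that a cheap matcher can over-generate candidates (high recall) and the expensive transformer only has to judge those pairs, which keeps the quadratic all-pairs comparison tractable.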
Srividya voted; Xinchen voted
#1346
-
Hitzler 2022
Generalizable Neuro-Symbolic Systems for Commonsense Question Answering
Front. Artif. Intell. Appl. 2022;342():294-310 IOS Press BV 2022 DOI: 10.3233/FAIA210360 · Ref ID: 5581 This chapter illustrates how suitable neuro-symbolic models for language understanding can enable domain generalizability and robustness in downstream tasks. Different methods for integrating neural language models and knowledge graphs are discussed. The situations in which this combination is most appropriate are characterized, including quantitative evaluation and qualitative error analysis on a variety of commonsense question answering benchmark datasets. © 2022 The authors and IOS Press. All rights reserved. |
mohammed afaan voted; yuexi voted
#3156
-
Hoang 2024
Semi-automated Construction of Complex Knowledge Base Question Answering Dataset Using Large Language Model
Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part V 2024;():230–248 Vilnius, Lithuania Springer-Verlag 2024 DOI: 10.1007/978-3-031-70362-1_14 · Ref ID: 7148 |
mohammed afaan voted; Ishan voted
#1977
-
Hofer 2024
Towards self-configuring Knowledge Graph Construction Pipelines using LLMs - A Case Study with RML
CEUR Workshop Proceedings 2024;3718(): CEUR-WS 2024 Ref ID: 4566 This paper explores using large language models (LLMs) to generate RDF mapping language (RML) files in the RDF turtle format as a key step towards self-configuring RDF knowledge graph construction pipelines. Our case study involves mapping a subset of the Internet Movie Database (IMDB) in JSON format given a target Movie ontology (selection of DBpedia Ontology OWL statements). We define and compute several scores to assess both the generated mapping files and the resulting graph using a manually created reference. Our findings demonstrate the promising potential of the state-of-the-art commercial LLMs in a zero-shot scenario. © 2024 Copyright for this paper by its authors. |
Ishan voted; brandon voted
#3752
-
Hogan 2022
An Overview of Distant Supervision for Relation Extraction with a Focus on Denoising and Pre-training Methods
arXiv 2022;(): 2022 Ref ID: 7564 Relation Extraction (RE) is a foundational task of natural language processing. RE seeks to transform raw, unstructured text into structured knowledge by identifying relational information between entity pairs found in text. RE has numerous uses, such as knowledge graph completion, text summarization, question-answering, and search querying. The history of RE methods can be roughly organized into four phases: pattern-based RE, statistical-based RE, neural-based RE, and large language model-based RE. This survey begins with an overview of a few exemplary works in the earlier phases of RE, highlighting limitations and shortcomings to contextualize progress. Next, we review popular benchmarks and critically examine metrics used to assess RE performance. We then discuss distant supervision, a paradigm that has shaped the development of modern RE methods. Lastly, we review recent RE works focusing on denoising and pre-training methods. |
mohammed afaan voted; yuexi voted
#2747
-
Hogl 2001
On supporting medical quality with intelligent data mining
Proceedings of the 34th Annual Hawaii International Conference on System Sciences 2001;():10 pp. 2001 DOI: 10.1109/HICSS.2001.926557 · Ref ID: 6308 The healthcare sector is currently facing both the economic necessity and the technical opportunity of a data based approach to quality management. Against this background, we introduce a process model for a data based medical quality management and apply intelligent data mining methods to patient data. Intelligent data mining incorporates advantages of both knowledge acquisition from data and from experts. We present the Knowledge Discovery Question Language (KDQL), a controlled language for business questions which abstracts from database and data mining terminology to allow high-level interaction. We use a knowledge-based measurement of relevant subjective interestingness facets like novelty, usefulness, and understandability which enables flexible ways to access the results of data mining. Questions asked in this project were targeted on diagnostic and therapeutic measures as well as the quality of documentation. For these issues in the field of medical quality management interesting results were found. |
mohammed afaan voted; yuexi voted
#358
-
Hong 2021
Improving Relation Extraction by Knowledge Representation Learning
IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) 2021;():1211-1215 Electr Network Ieee Computer Soc 2021 DOI: 10.1109/ictai52525.2021.00191 · Ref ID: 3672 Relation extraction is an important NLP task to extract the semantic relationship between two entities. Recently, large-scale pre-training language models have achieved excellent performance in many NLP applications. Most of the existing relation extraction models mainly rely on context information, but entity information is also very important for relation extraction, especially domain knowledge of entities and the direction between entity pairs. In this paper, based on the pre-trained BERT model, we propose a multi-task joint relation extraction model incorporating knowledge representation learning (KRL). The experimental results on the SemEval 2010 task 8 dataset and the KBP37 dataset show that our proposed model outperforms most state-of-the-art methods. The results on the larger dataset FewRe180 refined from FewRel also indicate that adding knowledge representation learning as an auxiliary objective is helpful for the relation extraction task. |
Mike voted; Srividya voted
#2717
-
Hoogs 2001
Multi-modal fusion for video understanding
Proceedings 30th Applied Imagery Pattern Recognition Workshop (AIPR 2001). Analysis and Understanding of Time Varying Imagery 2001;():103-108 2001 DOI: 10.1109/AIPR.2001.991210 · Ref ID: 6932 The exploitation of semantic information in computer vision problems can be difficult because of the large difference in representations and levels of knowledge. Image analysis is formulated in terms of low-level features describing image structure and intensity, while high-level knowledge such as purpose and common sense are encoded in abstract, non-geometric representations. In this work we attempt to bridge this gap through the integration of image analysis algorithms with WordNet, a large semantic network that explicitly links related words in a hierarchical structure. Our problem domain is the understanding of broadcast news, as this provides both linguistic information in the transcript and video information. Visual detection algorithms such as face detection and object tracking are applied to the video to extract basic object information, which is indexed into WordNet. The transcript provides topic information in the form of detected keywords. Together, both types of information are used to constrain a search within WordNet for a description of the video content in terms of the most likely WordNet concepts. This project is in its early stages; the general ideas and concepts are presented here. |
mohammed afaan voted; yuexi voted
#1406
-
Hoppe 2022
Improving Zero-Shot Text Classification with Graph-based Knowledge Representations
CEUR Workshop Proceedings 2022;3165(): CEUR-WS 2022 Ref ID: 5568 Insufficient training data is a key challenge for text classification. In particular, long-tail class distributions and emerging, new classes do not provide any training data for specific classes. Therefore, such a zero-shot setting must incorporate additional, external knowledge to enable transfer learning by connecting the external knowledge of previously unseen classes to texts. Recent zero-shot text classifiers utilize only distributional semantics defined by large language models and based on class names or natural language descriptions. This implicit knowledge contains ambiguities, cannot capture logical relations, and is not an efficient representation of factual knowledge. These drawbacks can be avoided by introducing explicit, external knowledge. In particular, knowledge graphs provide such explicit, unambiguous, and complementary, domain-specific knowledge. Hence, this thesis explores graph-based knowledge as an additional modality for zero-shot text classification. Besides a general investigation of this modality, the influence on the capabilities of dealing with domain shifts by including domain-specific knowledge is explored. © 2022 Copyright for this paper by its authors. |
Srividya voted; Xinchen voted
#3069
-
Hossain 2023
Utilizing GloVe Embeddings for Deep Learning-Based Analysis of Research Paper Abstracts
2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) 2023;():1-6 2023 DOI: 10.1109/HORA58378.2023.10156746 · Ref ID: 6166 Researchers are finding it harder and harder to locate relevant articles as the body of scientific literature expands at an exponential rate. Due to the sheer volume of publications, manual classification and categorization of these articles is no longer feasible. By accurately classifying research papers based on their abstracts, this paper aims to improve recommendation and search procedures for efficient academic information retrieval. When classifying papers into computer science, mathematics, physics, and statistics, the models achieve high precision, accuracy, recall, and F1-score by using deep learning algorithms (LSTM, GRU, Bi-LSTM, and Bi-GRU) and GloVe word embeddings to capture semantic information. Automatic categorization of research papers was made possible by combining GloVe word embeddings with deep learning algorithms, which sped up information search and knowledge discovery. These models can help academic researchers and practitioners streamline the process of categorizing research papers and boost their research efforts. |
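A common lightweight baseline for the setup in this abstract is to average the GloVe vectors of an abstract's words and classify by nearest class centroid. The sketch below uses tiny made-up 2-d vectors in place of real pretrained GloVe embeddings (which are 50-300 dimensional and loaded from a file), and averaging stands in for the paper's LSTM/GRU encoders:

```python
import math

# Toy 2-d "GloVe-like" vectors -- invented for illustration.
VEC = {
    "neural": (1.0, 0.1), "network": (0.9, 0.2), "training": (0.8, 0.3),
    "proof": (0.1, 1.0), "theorem": (0.2, 0.9), "algebra": (0.1, 0.8),
}

def embed(abstract):
    """Average the word vectors of the known words in an abstract."""
    vecs = [VEC[w] for w in abstract.lower().split() if w in VEC]
    n = len(vecs)
    return tuple(sum(v[i] for v in vecs) / n for i in range(2))

# Hypothetical per-class centroid embeddings.
CENTROIDS = {"computer science": (0.9, 0.2), "mathematics": (0.15, 0.9)}

def classify(abstract):
    """Assign the class whose centroid is nearest in embedding space."""
    e = embed(abstract)
    return min(CENTROIDS, key=lambda c: math.dist(e, CENTROIDS[c]))

label = classify("training a neural network")
```

The recurrent models in the paper replace this crude averaging with sequence encoders, but the embedding-then-compare pipeline is the same shape.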
Mike voted; Srividya voted
#60
-
Hou 2020
BERT-Based Chinese Relation Extraction for Public Security
The past few years have witnessed some public safety incidents occurring around the world. With the advent of the big data era, effectively extracting public security information from the internet has become of great significance. Up to hundreds of TBs of data are injected into the network every second, and thus it is impossible to process them manually. Natural Language Processing (NLP) is dedicated to the development of an intelligent system for effective text information mining. By analysing the text and quickly extracting the relationships between the relevant entities, NLP can establish the knowledge graph (KG) of public security, which lays the foundation for safety case analysis, information monitoring, and activity tracking and locating. One of the current pre-training relation extraction models is the Word2Vec model. The Word2vec model is single mapped, and it produces a static, single representation of the words in sentences. Then, the BERT model considers contextual information and provides more dynamic, richer vector representations of generated words. Therefore, in this paper, we propose a Bidirectional Encoder Representation from Transformers (BERT) based on the Chinese relation extraction algorithm for public security, which can effectively mine security information. The BERT model is obtained by training the Masked Language Model and predicting the next sentence task, which is based on the Transformer Encoder and the main model structure is the stacked Transformers. Extensive simulations are conducted to evaluate our proposed algorithm in comparison to some state-of-the-art schemes. |
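The Masked Language Model pre-training task mentioned in this abstract hides a fraction of the input tokens and trains the model to recover them. A simplified sketch (real BERT masks about 15% of tokens and, of those, keeps some unchanged or replaces some with random tokens rather than always using [MASK]; the sentence and seed here are arbitrary):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace a random fraction of tokens with [MASK] and record the
    originals as prediction targets (simplified MLM masking)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok       # the model must predict this token
        else:
            masked.append(tok)
    return masked, targets

tokens = "the suspect fled the scene by car".split()
masked, targets = mask_tokens(tokens)
```

During pre-training the encoder sees `masked` and its output at each masked position is scored against `targets`, which is what forces the contextual representations the abstract contrasts with static Word2Vec vectors.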
brandon voted; Kwesi voted
#2029
-
Hou 2022
What Has Been Enhanced in my Knowledge-Enhanced Language Model?
Findings of the Association for Computational Linguistics: EMNLP 2022 2022;():1417-1438 Association for Computational Linguistics (ACL) 2022 Ref ID: 5482 A number of knowledge integration (KI) methods have recently been proposed to incorporate external knowledge into pretrained language models (LMs). Even though knowledge-enhanced LMs outperform base LMs on knowledge-intensive tasks, the inner-workings of these KI methods are not well-understood. For instance, it is unclear which knowledge is effectively integrated into knowledge-enhanced LMs and which is not; and if such integration leads to catastrophic forgetting of already learned knowledge. We show that existing model interpretation methods such as linear probes and prompts have some key limitations in answering these questions. We revisit KI from an information-theoretic view and propose a new theoretically sound probe called Graph Convolution Simulator (GCS) for KI interpretation. GCS uses graph attention on the corresponding knowledge graph for interpretation. In our experiments we verify that GCS can provide reasonable interpretation results for two well-known knowledge-enhanced LMs: ERNIE and K-Adapter. We also find that only a marginal amount of knowledge is successfully integrated in these models, and simply increasing the size of the KI corpus may not lead to better knowledge-enhanced LMs. © 2022 Association for Computational Linguistics. |
Mike voted; Srividya voted
#66
-
Hou 2024
Bibliometric Analysis on the Research of Geoscience Knowledge Graph (GeoKG) from 2012 to 2023
The geoscience knowledge graph (GeoKG) has gained worldwide attention due to its ability in the formal representation of spatiotemporal features and relationships of geoscience knowledge. Currently, a quantitative review of the state and trends in GeoKG is still scarce. Thus, a bibliometric analysis was performed in this study to fill the gap. Specifically, based on 294 research articles published from 2012 to 2023, we conducted analyses in terms of the (1) trends in publications and citations; (2) identification of the major papers, sources, researchers, institutions, and countries; (3) scientific collaboration analysis; and (4) detection of major research topics and tendencies. The results revealed that the interest in GeoKG research has rapidly increased after 2019 and is continually expanding. China is the most productive country in this field. Co-authorship analysis shows that international and inter-institutional collaboration should be reinforced. Keyword analysis indicated that geoscience knowledge representation, information extraction, GeoKG construction, and GeoKG-based multi-source data integration were current hotspots. In addition, several important but currently neglected issues, such as the integration of Large Language Models, are highlighted. The findings of this review provide a systematic overview of the development of GeoKG and provide a valuable reference for future research. |
mohammed afaan voted; yuexi voted
#2274
-
Hsiao 2007
Constructing Human Brain-Function Association Models from fMRI Literature
2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2007;():1188-1191 2007 DOI: 10.1109/IEMBS.2007.4352509 · Ref ID: 6272 Toward the goal of understanding human brain function, we have developed a web-based human brain functional mapping knowledge base (HBFMKB) system to mine human brain-function association models from vast MEDLINE abstracts. Since nomenclature and relationships among cognitive functions have no consensus yet, we use rule-based natural language processing methods to extract behavioral tasks and cognitive functions and perform n-gram approximate concept mapping with the Unified Medical Language System (UMLS) knowledge source. The HBFMKB system has an automatic PubMed MEDLINE download and import system, a named entity extraction system, and an interactive visualization system. In summary, the HBFMKB system helps scientists obtain digested knowledge before designing experiments and compare their results with the current literature. |
mohammed afaan voted; yuexi voted
#3600
-
Hu 2024
Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models
arXiv 2024;(): 2024 Ref ID: 8529 Knowledge editing aims to update outdated or incorrect knowledge in large language models (LLMs). However, current knowledge editing methods have limited scalability for lifelong editing. This study explores the fundamental reason why knowledge editing fails in lifelong editing. We begin with the closed-form solution derived from linear associative memory, which underpins state-of-the-art knowledge editing methods. We extend the solution from single editing to lifelong editing, and through rigorous mathematical derivation, identify an interference term in the final solution, suggesting that editing knowledge may impact irrelevant knowledge. Further analysis of the interference term reveals a close relationship with superposition between knowledge representations. When knowledge superposition does not exist in language models, the interference term vanishes, allowing for lossless knowledge editing. Experiments across numerous language models reveal that knowledge superposition is universal, exhibiting high kurtosis, zero mean, and heavy-tailed distributions with clear scaling laws. Ultimately, by combining theory and experiments, we demonstrate that knowledge superposition is the fundamental reason for the failure of lifelong editing. Moreover, this is the first study to investigate knowledge editing from the perspective of superposition and provides a comprehensive observation of superposition across numerous real-world language models. Code available at https://github.com/ChenhuiHu/knowledge_in_superposition. |
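The linear associative memory this abstract builds on stores key-value pairs as a sum of outer products; recall is a matrix-vector product, and when keys are not orthogonal, recall picks up exactly the kind of interference term described above. A tiny pure-Python illustration (the 2-d vectors and stored values are invented):

```python
def outer(v, k):
    """Outer product v k^T as a list-of-rows matrix."""
    return [[vi * kj for kj in k] for vi in v]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def store(pairs, dim):
    """Hebbian linear associative memory: M = sum_i v_i k_i^T."""
    M = [[0.0] * dim for _ in range(dim)]
    for k, v in pairs:
        M = mat_add(M, outer(v, k))
    return M

# Orthonormal keys: recall M k recovers the stored value exactly.
M1 = store([((1, 0), (2, 3)), ((0, 1), (5, 7))], dim=2)
exact = mat_vec(M1, (1, 0))   # recovers (2, 3)

# Overlapping (non-orthogonal) keys: recall of the first key is
# corrupted by an interference term from the second pair.
M2 = store([((1, 0), (2, 3)), ((1, 1), (5, 7))], dim=2)
noisy = mat_vec(M2, (1, 0))   # no longer (2, 3)
```

In the paper's terms, the second case corresponds to knowledge representations in superposition: because real language models cannot keep all keys orthogonal, each edit leaks into unrelated knowledge, which is why lifelong editing degrades.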
Mike voted; Srividya voted
#2000
-
Hu 2023
Two-stage open information extraction method for the defence technology field
Qinghua Daxue Xuebao 2023;63(9):1309-1316 2023 DOI: 10.16511/j.cnki.qhdxxb.2023.21.010 · Ref ID: 5298 [Objective] The abundant information resources available on the internet about defense technology are of vital importance as data sources for obtaining high-value military intelligence. The aim of open information extraction in the field of defense technology is to extract structured triplets containing subject, predicate, object, and other arguments from the massive amount of information available on the internet. This technology has important implications for ontology induction and the construction of knowledge graphs in the defense technology domain. However, while information extraction experiments in the general domain yield good results, open information extraction in the defense technology domain faces several challenges, such as a lack of domain-annotated data, overlapping arguments, and long entities that are difficult to recognize. [Methods] In this paper, an annotation strategy based on entity boundaries is proposed, and an annotated dataset in the defense technology field was constructed with the help of domain experts. Furthermore, a two-stage open information extraction method is proposed in the defense technology field that utilizes a pretrained language model-based sequence labeling algorithm to extract predicates and a multihead attention mechanism to learn the prediction of argument boundaries. In the first stage, the input sentence was converted into an input sequence < [CLS], input sentence [SEP] >, and the input sequence was encoded using a pretrained language model to obtain an implicit state representation of the input sequence. Based on this sentence representation, a conditional random field (CRF) layer was used to predict the position of the predicates, i.e., to predict the BIO labels of the words.
In the second stage, the predicted predicates from the first stage were concatenated with the original sentence and converted into an input sequence < [CLS], predicate [SEP], input sentence [SEP] >, which was encoded using a pretrained language model to obtain an implicit state representation of the input sequence. This representation was then fed to a multihead pointer network to predict the position of the argument. The predicted positions were compared against the actual positions to calculate the cross-entropy loss. Finally, the predicates and the arguments predicted by the predicate and argument extraction models were combined to obtain the complete triplet. [Results] The experimental results from the extensive experiments conducted on a self-built annotated dataset in the defense technology field reveal the following. (1) In predicate extraction, our method achieved a 3.92% performance improvement in the F1 value as compared to LSTM methods and more than 10% performance improvement as compared to syntactic analysis methods. (2) In argument extraction, our method achieved a considerable performance improvement of more than 16% in the F1 value as compared to LSTM methods and about 11% in the F1 value as compared to the BERT + CRF method. [Conclusions] The proposed two-stage open information extraction method can overcome the challenge of overlapping arguments and the difficulty of long-span entity extraction, thus addressing the shortcomings of existing open information extraction methods. Extensive experimental analysis conducted on the self-built annotated dataset proved the effectiveness of the proposed method. © 2023 Press of Tsinghua University. All rights reserved. |
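The two-stage input construction and BIO decoding described in this abstract can be sketched as follows; whitespace tokenization, the example sentence, and the hand-written tag sequence stand in for the pretrained encoder and the CRF layer:

```python
def stage1_input(sentence):
    """Stage 1 input for predicate extraction: < [CLS], sentence, [SEP] >."""
    return ["[CLS]"] + sentence.split() + ["[SEP]"]

def decode_bio(tokens, tags):
    """Recover predicate spans from the B/I/O labels that the
    sequence-labeling (CRF) layer would predict per token."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I" and current:
            current.append(tok)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

def stage2_input(predicate, sentence):
    """Stage 2 input for argument extraction:
    < [CLS], predicate, [SEP], sentence, [SEP] >."""
    return ["[CLS]"] + predicate.split() + ["[SEP]"] + sentence.split() + ["[SEP]"]

sent = "the missile is launched from a submarine"
tags = ["O", "O", "B", "I", "O", "O", "O"]   # hypothetical CRF output
predicates = decode_bio(sent.split(), tags)  # -> ["is launched"]
```

Each decoded predicate is then paired with the sentence via `stage2_input`, and the pointer network predicts argument start/end positions over that second sequence.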
Mike voted; Srividya voted
#204
-
Hu 2023
An empirical study of pre-trained language models in simple knowledge graph question answering
Large-scale pre-trained language models (PLMs) such as BERT have recently achieved great success and become a milestone in natural language processing (NLP). It is now the consensus of the NLP community to adopt PLMs as the backbone for downstream tasks. In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models. However, there is still a lack of comprehensive research and comparison of the performance of different PLMs in KGQA. To this end, we summarize two basic KGQA frameworks based on PLMs without additional neural network modules to compare the performance of nine PLMs in terms of accuracy and efficiency. In addition, we present three benchmarks for larger-scale KGs based on the popular SimpleQuestions benchmark to investigate the scalability of PLMs. We carefully analyze the results of all PLMs-based KGQA basic frameworks on these benchmarks and two other popular datasets, WebQuestionSP and FreebaseQA, and find that knowledge distillation techniques and knowledge enhancement methods in PLMs are promising for KGQA. Furthermore, we test ChatGPT (https://chat.openai.com/), which has drawn a great deal of attention in the NLP community, demonstrating its impressive capabilities and limitations in zero-shot KGQA. We have released the code and benchmarks to promote the use of PLMs on KGQA (https://github.com/aannonymouuss/PLMs-in-Practical-KBQA). |
Srividya voted; Ishan voted
#3217
-
Hu 2024
Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8049 The attribution of question answering is to provide citations for supporting generated statements, and has attracted wide research attention. The current methods for automatically evaluating the attribution, which are often based on Large Language Models (LLMs), are still inadequate, particularly in recognizing subtle differences between attributions, and complex relationships between citations and statements. To compare these attribution evaluation methods and develop new ones, we introduce a set of fine-grained categories (i.e., supportive, insufficient, contradictory and irrelevant) for measuring the attribution, and develop a Complex Attributed Question Answering (CAQA) benchmark by leveraging knowledge graphs (KGs) for automatically generating attributions of different categories to question-answer pairs. Our analysis reveals that existing evaluators perform poorly under fine-grained attribution settings and exhibit weaknesses in complex citation-statement reasoning. Our CAQA benchmark, validated with human annotations, emerges as a promising tool for selecting and developing LLM attribution evaluators. |
Voted: Ishan, Xinchen · Final decision: (blank)
#2049
-
Hu 2024
ZipZap: Efficient Training of Language Models for Large-Scale Fraud Detection on Blockchain
WWW 2024 - Proceedings of the ACM Web Conference 2024;():2807-2816 Association for Computing Machinery, Inc 2024 DOI: 10.1145/3589334.3645352 · Ref ID: 4071 Language models (LMs) have demonstrated superior performance in detecting fraudulent activities on Blockchains. Nonetheless, the sheer volume of Blockchain data results in excessive memory and computational costs when training LMs from scratch, limiting their applicability to large-scale applications. In this paper, we present ZipZap, a framework tailored to achieve both parameter and computational efficiency when training LMs on large-scale transaction data. First, with frequency-aware compression, an LM can be compressed down to a mere 7.5% of its initial size with an imperceptible performance dip. This technique correlates the embedding dimension of an address with its occurrence frequency in the dataset, motivated by the observation that embeddings of low-frequency addresses are insufficiently trained, negating the need for a uniformly large dimension for knowledge representation. Second, ZipZap accelerates training through an asymmetric training paradigm: it performs transaction dropping and cross-layer parameter-sharing to expedite the pre-training process, while reverting to the standard training paradigm for fine-tuning to strike a balance between efficiency and efficacy, motivated by the observation that the optimization goals of pre-training and fine-tuning are inconsistent. Evaluations on real-world, large-scale datasets demonstrate that ZipZap delivers notable parameter and computational efficiency improvements for training LMs. Our implementation is available at: https://github.com/git-disl/ZipZap. © 2024 Owner/Author. |
Voted: Mike, Ishan · Final decision: (blank)
#2120
-
Hu 2018
Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs (Extended Abstract)
2018 IEEE 34th International Conference on Data Engineering (ICDE) 2018;():1815-1816 2018 DOI: 10.1109/ICDE.2018.00265 · Ref ID: 6116 RDF question/answering (Q/A) allows users to ask questions in natural languages over a knowledge base represented by RDF. To answer a natural language question, existing works focus on question understanding to deal with the disambiguation of phrase linking, while ignoring query composition and execution. In this paper, we propose a systematic framework to answer natural language questions over an RDF repository (RDF Q/A) from a graph data-driven perspective. We propose the (super) semantic query graph to model the query intention of the natural language question in a structural way, based on which RDF Q/A is reduced to a subgraph matching problem. More importantly, we resolve the ambiguity of both phrases and structures at the time when matches of the query are found. To build the super semantic query graph, we propose a node-first framework which has high robustness and can tackle complex questions. Extensive experiments confirm that our method not only improves precision but also greatly speeds up query performance. |
Voted: Ishan, Xinchen · Final decision: (blank)
#115
-
Hu 2024
Combining ChatGPT and knowledge graph for explainable machine learning-driven design: a case study
Machine learning has been widely used in design activities, enabling more informed decision-making. However, high-performance machine learning models, often referred to as 'black-box', result in a lack of explainability regarding predictions. The absence of explainability erodes the trust between designers and these models and hinders human-machine collaboration for desirable design decisions. Explainable AI focuses on creating explanations that are accessible and comprehensible to stakeholders, thereby improving explainability. A recent advancement in the field of explainable AI involves leveraging domain-specific knowledge via knowledge graph. Additionally, the advent of large language models like ChatGPT, acclaimed for their ability to output domain knowledge, perform complex language processing, and support seamless end-user interaction, has the potential to expand the horizons of explainable AI. Inspired by these developments, we propose the novel hybrid method that synergizes ChatGPT and knowledge graph to augment post-hoc explainability in design context. The outcome is the generation of more contextual and meaningful explanations, with the added possibility of further interaction to uncover deeper insights. The effectiveness of the proposed method is illustrated through a case study on customer segmentation. |
Voted: mohammed afaan, Ishan · Final decision: (blank)
#536
-
Hu 2024
LLM-TIKG: Threat intelligence knowledge graph construction utilizing large language model
Open-source threat intelligence is often unstructured and cannot be directly applied to the next detection and defense. By constructing a knowledge graph through open-source threat intelligence, we can better apply this information to intrusion detection. However, the current methods for constructing knowledge graphs face limitations due to the domain-specific attributes of entities and the analysis of lengthy texts, and they require large amounts of labeled data. Furthermore, there is a lack of authoritative open-source annotated threat intelligence datasets, which require significant manual effort. Moreover, it is noteworthy that current research often neglects the textual descriptions of attack behaviors, resulting in the loss of vital information to understand intricate cyber threats. To address these issues, we propose LLM-TIKG that applies the large language model to construct a knowledge graph from unstructured open-source threat intelligence. The few-shot learning capability of GPT is leveraged to achieve data annotation and augmentation, thereby creating the datasets for fine-tuning a smaller language model (7B). Using the fine-tuned model, we perform topic classification on the collected reports, extract entities and relationships, and extract TTPs from the attack description. This process results in the construction of a threat intelligence knowledge graph, enabling automated and universal analysis of textualized threat intelligence. The experimental results demonstrate improved performance in both named entity recognition and TTP classification, achieving the precision of 87.88% and 96.53%, respectively. |
Voted: brandon, Kwesi · Final decision: (blank)
#679
-
Hu 2023
PROMPTCAP: Prompt-Guided Image Captioning for VQA with GPT-3
IEEE/CVF International Conference on Computer Vision (ICCV) 2023;():2951-2963 Paris, FRANCE Ieee Computer Soc 2023 DOI: 10.1109/iccv51070.2023.00277 · Ref ID: 3770 Knowledge-based visual question answering (VQA) involves questions that require world knowledge beyond the image to yield the correct answer. Large language models (LMs) like GPT-3 are particularly helpful for this task because of their strong knowledge retrieval and reasoning capabilities. To enable LMs to understand images, prior work uses a captioning model to convert images into text. However, when summarizing an image in a single caption sentence, which visual entities to describe is often underspecified. Generic image captions often miss visual details essential for the LM to answer visual questions correctly. To address this challenge, we propose PROMPTCAP (Prompt-guided image Captioning), a captioning model designed to serve as a better connector between images and black-box LMs. Different from generic captions, PROMPTCAP takes a natural-language prompt to control the visual entities to describe in the generated caption. The prompt contains a question that the caption should aid in answering. To avoid extra annotation, PROMPTCAP is trained on examples synthesized with GPT-3 and existing datasets. We demonstrate PROMPTCAP's effectiveness on an existing pipeline in which GPT-3 is prompted with image captions to carry out VQA. PROMPTCAP outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60.4% on OK-VQA and 59.6% on A-OKVQA). Zero-shot results on WebQA show that PROMPTCAP generalizes well to unseen domains. |
Voted: Kwesi, brandon · Final decision: (blank)
#1053
-
Hu 2022
Can Pretrained Language Models Reason on Sparse Commonsense Knowledge Graph?
2022 IEEE 8th International Conference on Computer and Communications, ICCC 2022 2022;():2016-2022 Institute of Electrical and Electronics Engineers Inc. 2022 DOI: 10.1109/ICCC56324.2022.10065877 · Ref ID: 5317 Commonsense knowledge is the knowledge shared by most humans, which is always stored in a commonsense knowledge graph (CKG) as triplets. In this paper, we focus on the task of CKG completion, whose target is to predict the tail (head) entity given the head (tail) entity and the relation. Most existing works employ graph-based models, which aggregate information from neighboring entities on the CKG. Despite their effectiveness, they still suffer from two main weaknesses. Firstly, the semantic relations between head and tail entities are neglected. Secondly, due to the sparsity of CKGs, they rely on graph densification, which brings unexpected noise. To solve these problems, we propose a unified framework for COmmonSense knowledge graph completion based on BERT, namely COS-BERT. Firstly, we transfer each triplet into a natural sentence. Then, we fine-tune the pretrained language model using the transformed sentences. Finally, we rank the candidates based on the output representation of the sentences. Furthermore, we add a pre-filter to obtain a subset of candidates at the inference stage to save unnecessary computation costs. Comprehensive experiments have demonstrated the superiority of COS-BERT over the state of the art. © 2022 IEEE. |
Voted: Xinchen, Srividya · Final decision: (blank)
#3210
-
Hu 2024
Automating Knowledge Discovery from Scientific Literature via LLMs: A Dual-Agent Approach with Progressive Ontology Prompting
arXiv 2024;(): 2024 Ref ID: 8569 To address the challenge of automating knowledge discovery from a vast volume of literature, in this paper, we introduce a novel framework based on large language models (LLMs) that combines a progressive ontology prompting (POP) algorithm with a dual-agent system, named LLM-Duo, designed to enhance the automation of knowledge extraction from scientific articles. The POP algorithm utilizes a prioritized breadth-first search (BFS) across a predefined ontology to generate structured prompt templates and action orders, thereby guiding LLMs to discover knowledge in an automatic manner. Additionally, our LLM-Duo employs two specialized LLM agents: an explorer and an evaluator. These two agents work collaboratively and adversarially to enhance the reliability of the discovery and annotation processes. Experiments demonstrate that our method outperforms advanced baselines, enabling more accurate and complete annotations. To validate the effectiveness of our method in real-world scenarios, we employ our method in a case study of speech-language intervention discovery. Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain. We curate these findings into a publicly accessible intervention knowledge base that holds significant potential to benefit the speech-language therapy community. |
Voted: Kwesi, brandon · Final decision: (blank)
#694
-
Hu 2023
A question answering system for assembly process of wind turbines based on multi-modal knowledge graph and large language model
In the field of wind power generation, wind turbines serve as the foundation for harnessing electrical energy. However, the assembly process information for wind turbines is typically dispersed among various modalities such as 3D models, natural text, and images in the form of process documents. The difficulty in effectively utilising historical process knowledge hampers the efficiency of assembly process design and subsequently affects production efficiency. To address this issue, this paper constructs a Multi-modal Process Knowledge Graph for Wind Turbines, named MPKG-WT. Additionally, a wind turbine assembly process question-answering system combining multi-modal knowledge graphs with large language models (LLMs) is proposed to enable efficient utilisation of historical assembly process knowledge. The proposed approach achieves outstanding results when compared with other state-of-the-art KBQA methods and recent LLMs using a wind turbine assembly process dataset. The effectiveness of the approach is further validated through a visualised assembly process question-answering system. The research findings demonstrate a significant improvement in assembly process design efficiency. |
Voted: mohammed afaan, Ishan · Final decision: (blank)
#1226
-
Hu 2024
EEE-QA: Exploring Effective and Efficient Question-Answer Representations
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():5520-5525 European Language Resources Association (ELRA) 2024 Ref ID: 4546 Current approaches to question answering rely on pre-trained language models (PLMs) like RoBERTa. This work challenges the existing question-answer encoding convention and explores finer representations. We begin by testing various pooling methods against using the begin-of-sentence token as the question representation, for better quality. Next, we explore opportunities to simultaneously embed all answer candidates with the question. This enables cross-reference between answer choices and improves inference throughput via reduced memory usage. Despite their simplicity and effectiveness, these methods have yet to be widely studied in current frameworks. We experiment with different PLMs, with and without the integration of knowledge graphs. Results prove the memory efficacy of the proposed techniques with little sacrifice in performance. Practically, our work improves throughput by 38-100% with 26-65% speedups on consumer-grade GPUs by allowing for considerably larger batch sizes. Our work sends a message to the community with promising directions in both representation quality and efficiency for the question-answering task in natural language processing. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Voted: Srividya, Mike · Final decision: (blank)
#1559
-
Huang 2024
Large Foundation Models for Power Systems
IEEE Power and Energy Society General Meeting 2024;(): IEEE Computer Society 2024 DOI: 10.1109/PESGM51994.2024.10688670 · Ref ID: 4124 Foundation models, such as Large Language Models (LLMs), can respond to a wide range of format-free queries without any task-specific data collection or model training, creating various research and application opportunities for the modeling and operation of large-scale power systems. In this paper, we outline how large foundation models such as GPT-4 are developed, and discuss how they can be leveraged in challenging power and energy system tasks. We first investigate the potential of existing foundation models by validating their performance on four representative tasks across power system domains, including optimal power flow (OPF), electric vehicle (EV) scheduling, knowledge retrieval for power engineering technical reports, and situation awareness. Our results indicate strong capabilities of such foundation models in boosting the efficiency and reliability of power system operational pipelines. We also provide suggestions and projections on the future deployment of foundation models in power system applications. © 2024 IEEE. |
Voted: Davis, mohammed afaan · Final decision: (blank)
#482
-
Huang 2023
KOSA: KO Enhanced Salary Analytics based on Knowledge Graph and LLM Capabilities
23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():499-505 Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023 DOI: 10.1109/icdmw60847.2023.00071 · Ref ID: 2973 Knowledge base question answering (KBQA) is designed to respond to natural language inquiries by utilizing factual information, such as entities, relationships, and attributes, derived from a knowledge base (KB). The advent of large language models (LLMs) has significantly boosted the performance of KBQA, owing to their exceptional capabilities in content comprehension and generation. In this paper, we present a Knowledge Ocean enhanced Salary Analytics (KOSA) system based on knowledge graphs and LLMs tailored to employee salary data from a public university. This system encompasses an interactive conversational interface, visualization of knowledge graphs, and advanced data analysis. By employing the framework of knowledge engineering, we enable knowledge graph modeling, Cypher (the query engine of Neo4j) reasoning, and question answering functionalities. Furthermore, machine learning algorithms are integrated to facilitate advanced features, such as salary prediction and allocation. |
Voted: Srividya, Xinchen · Final decision: (blank)
#3980
-
Huang 2024
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark
arXiv 2024;(): 2024 Ref ID: 8177 Recently, knowledge editing on large language models (LLMs) has received considerable attention. Compared to this, editing Large Vision-Language Models (LVLMs) faces extra challenges from diverse data modalities and complicated model components, and data for LVLM editing are limited. The existing LVLM editing benchmark, which comprises three metrics (Reliability, Locality, and Generality), falls short in the quality of synthesized evaluation images and cannot assess whether models apply edited knowledge in relevant content. Therefore, we employ more reliable data collection methods to construct a new Large Vision-Language Model Knowledge Editing Benchmark, VLKEB, and extend the Portability metric for more comprehensive evaluation. Leveraging a multi-modal knowledge graph, our image data are bound with knowledge entities. This can be further used to extract entity-related knowledge, which constitutes the base of the editing data. We conduct experiments with different editing methods on five LVLMs, and thoroughly analyze how they impact the models. The results reveal the strengths and deficiencies of these methods and hopefully provide insights for future research. The codes and dataset are available at: https://github.com/VLKEB/VLKEB. |
Voted: mohammed afaan, yuexi · Final decision: (blank)
#785
-
Huang 2024
SSNF: Optimizing Entity Alignment with a Novel Structural and Semantic Neighbor Filtering
17th International Conference on Knowledge Science, Engineering and Management (KSEM) 2024;14885():180-191 Birmingham, ENGLAND Springer-Verlag Singapore Pte Ltd 2024 DOI: 10.1007/978-981-97-5495-3_13 · Ref ID: 3393 In the domain of Knowledge Graphs (KGs), the alignment of entities is pivotal, aiming to identify and match equivalent entities across distinct KGs. Existing methodologies primarily aggregate information from direct neighbors via graph neural networks, a process which can inadvertently introduce noise. To address this challenge, we introduce SSNF, an innovative neighbor filtering mechanism that optimally balances the structural and semantic information crucial for accurate entity alignment. It employs motifs for structural assessment and leverages Large Language Models (LLMs) for semantic analysis, using a 'Reasoning-Challenging (Re-Cha)' strategy to query LLMs and determine important neighbors. This dual-focus strategy mitigates the inclusion of less informative neighbors. When integrated with existing Entity Alignment (EA) frameworks, our approach demonstrates superior efficacy, significantly outperforming conventional methods through meticulous neighbor selection. Extensive experiments conducted on the most widely used benchmark dataset (i.e., DBP15K) show a significant improvement in EA performance, demonstrating the potential of our approach to advance KG entity alignment by synergizing structural insights and semantic precision. |
Voted: Davis, Mike · Final decision: (blank)
#1827
-
Huang 2020
Review of Deep Learning-Based Topic Model
As a research hotspot for more than twenty years, the topic model plays an important role in the semantic analysis of multi-document collections. The topic model is adept at extracting groups of keywords from documents to represent their core ideas, and thus provides crucial support for document classification, information retrieval, automatic multi-document summarization, sentiment analysis and so on. Conventional topic models based on three-layer Bayesian networks have been well studied in the past ten years. However, combining topic models with deep learning techniques has given them a new lease of life in recent years, owing to the wide application of deep learning in natural language processing, such as word embedding training, text generation and knowledge graph building. In deep-learning-based topic models, a major task is to design more accurate and effective models by introducing advanced ideas and techniques from deep learning, such as word embeddings, neural networks (e.g., recurrent neural networks, RNNs), variational auto-encoders (VAEs) and knowledge graphs. In this review, we first comparatively discuss four probabilistic topic models and two sparse additive topic models in terms of model assumptions, document generation process and parameter inference. These are latent Dirichlet allocation (LDA), Dirichlet multinomial mixture model (DMM), biterm topic model (BTM), sparse topical coding (STC) and sparse additive generative model (SAGM), respectively. The above six models are typical representatives of the conventional topic model and have spawned various improved versions and applications since they were proposed. Then, we introduce the latest research progress of deep-learning-based topic models in detail, which can be summed up as three different types of models.
The first type is the word-embedding-based probabilistic topic model, which improves one of the conventional topic models (e.g., LDA, DMM or BTM) with auxiliary pre-trained word embeddings while still complying with the basic assumptions of the original model. In these models, word embeddings pre-trained on a large corpus such as Wikipedia are introduced to evaluate the similarity between word pairs. Based on this evaluation, similar words are more likely to be assigned to the same topic during the topic sampling process, eventually improving topic coherence and text classification accuracy. The second type is the neural-network-based topic model, which employs a neural network structure, such as a Multilayer Perceptron (MLP) or RNN, to model the document generation process while introducing a latent topic structure. In these models, the bag-of-words representation of a text is fed into the neural topic model and transformed into embeddings, and the topic distribution and topic-word distribution are then inferred by the neural network. To further improve the performance of the neural topic model, a VAE is employed to transfer the text embeddings into a latent space before the topic inference process, and a sparsity constraint on the topic-word distribution is enforced in the model to generate more expressive topical words. The third type is the jointly trained topic and language model, which trains a topic model and a language model simultaneously. In these models, the token sequence of a text is fed into a neural network to generate text under the guidance of latent topics. Furthermore, we summarize the public datasets (e.g., 20NewsGroups) and evaluation metrics (e.g., Pointwise Mutual Information) used in the above topic models. Finally, we conclude by discussing some potential trends in the future development of topic models. © 2020, Science Press. All rights reserved. |
Voted: mohammed afaan, Ishan · Final decision: (blank)
#882
-
Huang 2023
WERECE: An Unsupervised Method for Educational Concept Extraction Based on Word Embedding Refinement
The era of educational big data has sparked growing interest in extracting and organizing educational concepts from massive amounts of information. Outcomes are of the utmost importance for artificial intelligence-empowered teaching and learning. Unsupervised educational concept extraction methods based on pre-trained models continue to proliferate due to ongoing advances in semantic representation. However, it remains challenging to directly apply pre-trained large language models to extract educational concepts; pre-trained models are built on extensive corpora and do not necessarily cover all subject-specific concepts. To address this gap, we propose a novel unsupervised method for educational concept extraction based on word embedding refinement (i.e., word embedding refinement-based educational concept extraction (WERECE)). It integrates a manifold learning algorithm to adapt a pre-trained model for extracting educational concepts while accounting for the geometric information in semantic computation. We further devise a discriminant function based on semantic clustering and Box-Cox transformation to enhance WERECE's accuracy and reliability. We evaluate its performance on two newly constructed datasets, EDU-DT and EDUTECH-DT. Experimental results show that WERECE achieves an average precision up to 85.9%, recall up to 87.0%, and F1 scores up to 86.4%, which significantly outperforms baselines (TextRank, term frequency-inverse document frequency, isolation forest, K-means, and one-class support vector machine) on educational concept extraction. Notably, when WERECE is implemented with different parameter settings, its precision and recall sensitivity remain robust. WERECE also holds broad application prospects as a foundational technology, such as for building discipline-oriented knowledge graphs, enhancing learning assessment and feedback, predicting learning interests, and recommending learning resources. |
Voted: Mike, Srividya · Final decision: (blank)
#3850
-
Huang 2024
RiTeK: A Dataset for Large Language Models Complex Reasoning over Textual Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8725 Answering complex real-world questions often requires accurate retrieval from textual knowledge graphs (TKGs). The scarcity of annotated data, along with intricate topological structures, makes this task particularly challenging. As relational path information can enhance the inference ability of Large Language Models (LLMs), efficiently retrieving more complex relational path information from TKGs presents another key challenge. To tackle these challenges, we first develop a Dataset for LLMs Complex Reasoning over Textual Knowledge Graphs (RiTeK) with broad topological structure coverage. We synthesize realistic user queries that integrate diverse topological structures, relational information, and complex textual descriptions. We conduct rigorous expert evaluation to validate the quality of our synthesized queries. We then introduce an enhanced Monte Carlo Tree Search (MCTS) method, Relational MCTS, to automatically extract relational path information from textual graphs for specific queries. Our dataset mainly covers the medical domain, as the relation types and entities are complex and publicly available. Experimental results indicate that RiTeK poses significant challenges for current retrieval and LLM systems, while the proposed Relational MCTS method enhances LLM inference ability and achieves state-of-the-art performance on RiTeK. |
Voted: mohammed afaan, yuexi · Final decision: (blank)
#1513
-
Huang 2020
Knowledge graph-augmented abstractive summarization with semantic-driven cloze reward
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2020;():5094-5107 Association for Computational Linguistics (ACL) 2020 DOI: 10.18653/v1/2020.acl-main.457 · Ref ID: 5777 Sequence-to-sequence models for abstractive summarization have been studied extensively, yet the generated summaries commonly suffer from fabricated content, and are often found to be near-extractive. We argue that, to address these issues, the summarizer should acquire semantic interpretation over input, e.g., via structured representation, to allow the generation of more informative summaries. In this paper, we present ASGARD, a novel framework for Abstractive Summarization with Graph-Augmentation and semantic-driven RewarD. We propose the use of dual encoders-a sequential document encoder and a graph-structured encoder-to maintain the global context and local characteristics of entities, complementing each other. We further design a reward based on a multiple choice cloze test to drive the model to better capture entity interactions. Results show that our models produce significantly higher ROUGE scores than a variant without knowledge graph as input on both New York Times and CNN/Daily Mail datasets. We also obtain better or comparable performance compared to systems that are fine-tuned from large pretrained language models. Human judges further rate our model outputs as more informative and containing fewer unfaithful errors. © 2020 Association for Computational Linguistics |
Voted: Srividya, Ishan · Final decision: (blank)
#671
-
Huang 2023
PRODIGY: Enabling In-context Learning Over Graphs
37th Conference on Neural Information Processing Systems (NeurIPS) 2023;(): New Orleans, LA Neural Information Processing Systems (Nips) 2023 Ref ID: 3504 In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks by conditioning on prompt examples, without optimizing any parameters. While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pretraining Over Diverse In-Context Graph Systems (PRODIGY), the first pretraining framework that enables in-context learning over graphs. The key idea of our framework is to formulate in-context learning over graphs with a novel prompt graph representation, which connects prompt examples and queries. We then propose a graph neural network architecture over the prompt graph and a corresponding family of in-context pretraining objectives. With PRODIGY, the pretrained model can directly perform novel downstream classification tasks on unseen graphs via in-context learning. We provide empirical evidence of the effectiveness of our framework by showcasing its strong in-context learning performance on tasks involving citation networks and knowledge graphs. Our approach outperforms the in-context learning accuracy of contrastive pretraining baselines with hard-coded adaptation by 18% on average across all setups. Moreover, it also outperforms standard finetuning with limited data by 33% on average with in-context learning. |
Voted: yuexi, Srividya · Final decision: (blank)
#1998
-
Huang 2025
Two Semantic Information Extension Enhancement Methods For Zero-Shot Learning
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2025;15035 LNCS():511-525 Springer Science and Business Media Deutschland GmbH 2025 DOI: 10.1007/978-981-97-8620-6_35 · Ref ID: 3804 In the domain of computer vision, Zero-Shot Learning (ZSL) achieves the classification of unseen class objects through the utilization of semantic information of class relationships. Acquiring richer semantic information and representations is a significant avenue for enhancing learner performance. Existing studies of ZSL predominantly address this challenge only by introducing knowledge graphs and graph neural networks, overlooking inadequacies in the original semantic information and the intrinsic hierarchical and directional characteristics of the graph structure. This paper proposes two semantic information enhancing methods for ZSL, respectively tailored for regular datasets and large-scale datasets. For regular ZSL datasets, our method leverages textual knowledge within large language models, extending traditional 2-dimensional attribute annotations to a 3-dimensional space to obtain more comprehensive class-level semantic information. For large ZSL tasks, our approach combines enhanced semantic information with external knowledge graphs to simulate class relationships, employing the intrinsic structure and directionality of graphs to bolster semantic representations. We validated our approaches on four traditional ZSL datasets and the ImageNet dataset. The experimental results show significant improvements in ZSL performance, underscoring the potential of our methods. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025. |
mohammed afaan voted
Ishan voted
#1177
-
Huang 2024
Designing an Interpretable Question Answering System for Vertical Domains Based on Large Language Model and Knowledge Graph
Advances in Transdisciplinary Engineering 2024;57():552-561 IOS Press BV 2024 DOI: 10.3233/ATDE240503 · Ref ID: 3988 Given the low interpretability of large language models (LLMs) due to their extensive parameters and intricate features, this study aims to enhance the understandability and interpretability of automatic QA systems powered by LLMs, thereby addressing a critical gap in the field. To achieve this, we introduce an interpretable architecture for a domain-specific LLM-based question-answering (QA) system. The research decomposes the QA system into six modules: operation recognition, intent recognition, normalization, triplet structured data conversion, knowledge graph querying, and query result processing. Through this approach, the input and output of each module in the QA system are human-readable text data, enhancing the interpretability of the QA system's processing. The use of knowledge graph data increases the credibility of the answers provided by the QA system. The QA system architecture proposed in this study attempts to integrate the powerful natural language understanding capabilities of large language models with the data querying capacity of knowledge graphs, offering a reference for addressing the issue of low interpretability in automatic QA systems based on large language models (LLMs). © 2024 The Authors. |
Ishan voted
Srividya voted
#3985
-
Huang 2024
WESE: Weak Exploration to Strong Exploitation for LLM Agents
arXiv 2024;(): 2024 Ref ID: 8226 Recently, large language models (LLMs) have demonstrated remarkable potential as intelligent agents. However, existing research mainly focuses on enhancing the agent's reasoning or decision-making abilities through well-designed prompt engineering or task-specific fine-tuning, ignoring the procedure of exploration and exploitation. When addressing complex tasks within open-world interactive environments, these methods exhibit limitations. Firstly, the lack of global information about the environment leads to greedy decisions, resulting in sub-optimal solutions. Secondly, irrelevant information acquired from the environment not only introduces noise, but also incurs additional cost. This paper proposes a novel approach, Weak Exploration to Strong Exploitation (WESE), to enhance LLM agents in solving open-world interactive tasks. Concretely, WESE decouples the exploration and exploitation process, employing a cost-effective weak agent to perform exploration tasks for global knowledge. A knowledge graph-based strategy is then introduced to store the acquired knowledge and extract task-relevant knowledge, enhancing the stronger agent in success rate and efficiency for the exploitation task. Our approach is flexible enough to incorporate diverse tasks, and obtains significant improvements in both success rates and efficiency across four interactive benchmarks. |
mohammed afaan voted
Ishan voted
#2852
-
Huang 2010
Research on Representation of Geographic Feature Based on Geo-Ontology
2010 2nd International Workshop on Intelligent Systems and Applications 2010;():1-5 2010 DOI: 10.1109/IWISA.2010.5473529 · Ref ID: 6310 In the future, GIS will develop toward widespread application, and geographic information sharing has become a pressing problem. Because people's cognition of the real geographic world differs from one another, semantic discrepancies arise; realizing geographic information sharing therefore first requires realizing semantic sharing in GIS. Geo-ontology provides generally accepted concepts in the geographic information domain, together with explicit formalized definitions, so as to resolve the problems arising from differing geographic cognition and the associated description-logic inter-translation problem. Geo-ontology can thus be applied to the integration and sharing of geographic information. Starting from the process of abstraction and generalization from the real geographic world to the computer world, this paper analyzes the sources of semantic heterogeneity and introduces geographic features and the generalized geographic objects that represent them. It then analyzes the relationships between geo-ontology and geographic features in detail, and finally focuses on how geo-ontology can be used to represent geographic features. |
mohammed afaan voted
yuexi voted
#596
-
Huang 2023
MVP-Tuning: Multi-View Knowledge Retrieval with Prompt Tuning for Commonsense Reasoning
61st Annual Meeting of the Association for Computational Linguistics (ACL) 2023;():13417-13432 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3753 Recent advances in pre-trained language models (PLMs) have facilitated the development of commonsense reasoning tasks. However, existing methods rely on multi-hop knowledge retrieval and thus suffer low accuracy due to embedded noise in the acquired knowledge. In addition, these methods often attain high computational costs and nontrivial knowledge loss because they encode the knowledge independently of the PLM, making it less relevant to the task and resulting in a poor local optimum. In this work, we propose Multi-View Knowledge Retrieval with Prompt Tuning (MVP-Tuning). Our MVP-Tuning leverages similar question-answer pairs in the training set to improve knowledge retrieval and employs a single prompt-tuned PLM to model knowledge and input text jointly. We conduct our experiments on five commonsense reasoning QA benchmarks to show that MVP-Tuning outperforms all other baselines on 4 out of 5 datasets with at most 2% trainable parameters. The ensemble of our MVP-Tuning models even achieves new state-of-the-art performance on OpenBookQA and is ranked first place on the leaderboard(1). Our code and data are available(2). |
Srividya voted
Ishan voted
#3805
-
Hussien 2024
RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models
arXiv 2024;(): 2024 Ref ID: 8269 Prediction of road users' behaviors in the context of autonomous driving has gained considerable attention by the scientific community in the last years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of the reality since road users are humans, and as such they are highly influenced by their surrounding context. In addition, a large plethora of research works rely on powerful Deep Learning techniques, which exhibit high performance metrics in prediction tasks but may lack the ability to fully understand and exploit the contextual semantic information contained in the road scene, not to mention their inability to provide explainable predictions that can be understood by humans. In this work, we propose an explainable road users' behavior prediction system that integrates the reasoning abilities of Knowledge Graphs (KG) and the expressiveness capabilities of Large Language Models (LLM) by using Retrieval Augmented Generation (RAG) techniques. For that purpose, Knowledge Graph Embeddings (KGE) and Bayesian inference are combined to allow the deployment of a fully inductive reasoning system that enables the issuing of predictions that rely on legacy information contained in the graph as well as on current evidence gathered in real time by onboard sensors. Two use cases have been implemented following the proposed approach: 1) Prediction of pedestrians' crossing actions; 2) Prediction of lane change maneuvers. In both cases, the performance attained surpasses the current state of the art in terms of anticipation and F1-score, showing a promising avenue for future research in this field. |
yuexi voted
mohammed afaan voted
#1101
-
Hwang 2021
(COMET-)ATOMIC2020: On Symbolic and Neural Commonsense Knowledge Graphs
35th AAAI Conference on Artificial Intelligence, AAAI 2021 2021;7():6384-6392 Association for the Advancement of Artificial Intelligence 2021 Ref ID: 5655 Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs (CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about the quality and coverage of these resources due to the massive scale required to comprehensively encompass general commonsense knowledge. In this work, we posit that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents. Therefore, we propose a new evaluation framework for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them. With this new goal, we propose ATOMIC2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models. We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains ∼12 absolute points lower than a BART-based knowledge model trained on ATOMIC2020 despite using over 430x fewer parameters. Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. |
Mike voted
Srividya voted
#3945
-
Ibáñez 2023
Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination
arXiv 2023;(): 2023 Ref ID: 7913 Knowledge Graphs (KGs) have emerged as fundamental platforms for powering intelligent decision-making and a wide range of Artificial Intelligence (AI) services across major corporations such as Google, Walmart, and AirBnb. KGs complement Machine Learning (ML) algorithms by providing data context and semantics, thereby enabling further inference and question-answering capabilities. The integration of KGs with neuronal learning (e.g., Large Language Models (LLMs)) is currently a topic of active research, commonly named neuro-symbolic AI. Despite the numerous benefits that can be accomplished with KG-based AI, its growing ubiquity within online services may result in the loss of self-determination for citizens as a fundamental societal issue. The more we rely on these technologies, which are often centralised, the less citizens will be able to determine their own destinies. To counter this threat, AI regulation, such as the European Union (EU) AI Act, is being proposed in certain regions. The regulation sets what technologists need to do, leading to questions concerning: How can the output of AI systems be trusted? What is needed to ensure that the data fuelling and the inner workings of these artefacts are transparent? How can AI be made accountable for its decision-making? This paper conceptualises the foundational topics and research pillars to support KG-based AI for self-determination. Drawing upon this conceptual framework, challenges and opportunities for citizen self-determination are illustrated and analysed in a real-world scenario. As a result, we propose a research agenda aimed at accomplishing the recommended objectives. |
Davis voted
mohammed afaan voted
#3218
-
Ifergan 2024
Beneath the Surface of Consistency: Exploring Cross-lingual Knowledge Representation Sharing in LLMs
arXiv 2024;(): 2024 Ref ID: 8548 The veracity of a factoid is largely independent of the language it is written in. However, language models are inconsistent in their ability to answer the same factual question across languages. This raises questions about how LLMs represent a given fact across languages. We explore multilingual factual knowledge through two aspects: the model's ability to answer a query consistently across languages, and the ability to ''store'' answers in a shared representation for several languages. We propose a methodology to measure the extent of representation sharing across languages by repurposing knowledge editing methods. We examine LLMs with various multilingual configurations using a new multilingual dataset. We reveal that high consistency does not necessarily imply shared representation, particularly for languages with different scripts. Moreover, we find that script similarity is a dominant factor in representation sharing. Finally, we observe that if LLMs could fully share knowledge across languages, their accuracy in their best-performing language could benefit an increase of up to 150% on average. These findings highlight the need for improved multilingual knowledge representation in LLMs and suggest a path for the development of more robust and consistent multilingual LLMs. |
yuexi voted
Mike voted
#1423
-
Iga 2024
Integrating LLMs with Knowledge Graphs-enhanced Task-Oriented Dialogue Systems
CEUR Workshop Proceedings 2024;3767():40-51 CEUR-WS 2024 Ref ID: 4117 Large Language Models (LLM) have become the state-of-the-art natural language processing systems. Their emergent abilities paved the way for dialogue systems capable of understanding and solving users’ specific tasks, ranging from arithmetic problems to simple chatting, all expressed in natural language. However, for specific domains, research has shown that LLMs cannot directly substitute Task-Oriented Dialogue Systems (TOD). TOD Systems aims to master a specific domain or company, enabling communication by natural language. Thus, this research project focuses on building personalized TODS with the help of artificial intelligence, using LLMs grounded with Temporal Knowledge Graphs. We assess the temporal validity of facts in the KG through temporal timestamps. To capture the dynamics of a company or domain, business processes are modeled with BPMN, offering the possibility of converting them to KGs. Finally, the TOD System will be able to grow a domain-specific KG and reason over it, leveraging LLMs capabilities of solving KG-related tasks. © 2023 Copyright for this paper by its authors. |
mohammed afaan voted
yuexi voted
#539
-
Iga 2024
LLMs for Knowledge-Graphs Enhanced Task-Oriented Dialogue Systems: Challenges and Opportunities
36th International Conference on Advanced Information Systems Engineering (CAiSE) 2024;521():168-179 Limassol, CYPRUS Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-61003-5_15 · Ref ID: 2981 Large Language Models are a great tool for solving diverse tasks formulated in natural language. Recent work has demonstrated their capacity of solving tasks related to Knowledge Graphs, such as Knowledge Graph Completion or Knowledge Graph Reasoning, even in Zero- or Few-Shot paradigms. However, given a particular input, they do not always produce the same output, and sometimes point to intermediate reasoning steps that are not valid, even if they produce a satisfactorily answer. Moreover, the use of LLMs is mostly studied for static knowledge graphs, while temporal ones are overlooked. To highlight opportunities and challenges in knowledge graph related tasks, we experiment with ChatGPT on graph completion and reasoning for both static and temporal facets, using three different prompting techniques in Zero- and One-Shot contexts, on a Task-Oriented Dialogue system use case. Our results show that ChatGPT can solve given tasks, but mostly in a nondeterministic way. |
Xinchen voted
mohammed afaan voted
#151
-
Ilievski 2021
CSKG: The CommonSense Knowledge Graph
18th Extended Semantic Web Conference (ESWC) 2021;12731():680-696 Electr Network Springer International Publishing Ag 2021 DOI: 10.1007/978-3-030-77385-4_41 · Ref ID: 2994 Sources of commonsense knowledge support applications in natural language understanding, computer vision, and knowledge graphs. Given their complementarity, their integration is desired. Yet, their different foci, modeling approaches, and sparse overlap make integration difficult. In this paper, we consolidate commonsense knowledge by following five principles, which we apply to combine seven key sources into a first integrated CommonSense Knowledge Graph (CSKG). We analyze CSKG and its various text and graph embeddings, showing that CSKG is well-connected and that its embeddings provide a useful entry point to the graph. We demonstrate how CSKG can provide evidence for generalizable downstream reasoning and for pre-training of language models. CSKG and all its embeddings are made publicly available to support further research on commonsense knowledge integration and reasoning. |
Xinchen voted
Srividya voted
#1595
-
Incitti 2024
Leveraging LLMs for Knowledge Engineering from Technical Manuals: A Case Study in the Medical Prosthesis Manufacturing Domain
FUSION 2024 - 27th International Conference on Information Fusion 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.23919/FUSION59988.2024.10706469 · Ref ID: 4198 Ontologies are nowadays widely used to organize information across specific domains, being effective due to their hierarchical structure and the ability to explicitly represent relationships between concepts. Knowledge engineering, like compiling companies' vast bodies of knowledge into these structures, however, still represents a time-consuming, largely manually performed process, esp. with significant amounts of knowledge often only recorded within unstructured text documents. Since the recently introduced Large Language Models (LLMs) excel on text summarization, this raises the question whether these could be exploited within dedicated knowledge fusion architectures to assist human knowledge engineers by automatically suggesting relevant classes, instances and relations extracted from textual corpora. We therefore propose a novel approach that leverages the taxonomic structure of a partially defined ontology to prompt LLMs for hierarchical knowledge organization. Unlike conventional methods that rely solely on static ontologies, our methodology dynamically generates prompts based on the ontology's existing class taxonomy, prompting the LLM to generate responses that extract supplementary information from unstructured documents. It thus introduces the concept of using ontologies as scaffolds for guiding LLMs, in order to realize a mutual interplay between structured ontological knowledge and the soft fusion capabilities of LLMs. We evaluate our proposed algorithm on a real-world case study, performing a knowledge fusion task on heterogeneous technical documentation from a medical prosthesis manufacturer. © 2024 ISIF. |
Kwesi voted
Xinchen voted
#526
-
Islakoglu 2024
Leveraging Pre-trained Language Models for Time Interval Prediction in Text-Enhanced Temporal Knowledge Graphs
21st International Conference on The Semantic Web (ESWC) 2024;14664():59-78 Hersonissos, GREECE Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-60626-7_4 · Ref ID: 3119 Most knowledge graph completion (KGC) methods rely solely on structural information, even though a large number of publicly available KGs contain additional temporal (validity time intervals) and textual data (entity descriptions). While recent temporal KGC methods utilize time information to enhance link prediction, they do not leverage textual descriptions or support inductive inference (prediction for entities that have not been seen during training). In this work, we propose a novel framework called TEMT that exploits the power of pre-trained language models (PLMs) for temporal KGC. TEMT predicts time intervals of facts by fusing their textual and temporal information. It also supports inductive inference by utilizing PLMs. In order to showcase the power of TEMT, we carry out several experiments including time interval prediction, both in transductive and inductive settings, and triple classification. The experimental results demonstrate that TEMT is competitive with the state-of-the-art, while also supporting inductiveness. |
Srividya voted
Xinchen voted
#3675
-
Israelsen 2023
LLMs for Multi-Modal Knowledge Extraction and Analysis in Intelligence/Safety-Critical Applications
arXiv 2023;(): 2023 Ref ID: 7973 Large Language Models have seen rapid progress in capability in recent years; this progress has been accelerating and their capabilities, measured by various benchmarks, are beginning to approach those of humans. There is a strong demand to use such models in a wide variety of applications but, due to unresolved vulnerabilities and limitations, great care needs to be used before applying them to intelligence and safety-critical applications. This paper reviews recent literature related to LLM assessment and vulnerabilities to synthesize the current research landscape and to help understand what advances are most critical to enable the use of these technologies in intelligence and safety-critical applications. The vulnerabilities are broken down into ten high-level categories and overlaid onto a high-level life cycle of an LLM. Some general categories of mitigations are reviewed. |
mohammed afaan voted
yuexi voted
#1730
-
Izquierdo-Badiola 2024
PlanCollabNL: Leveraging Large Language Models for Adaptive Plan Generation in Human-Robot Collaboration
Proceedings - IEEE International Conference on Robotics and Automation 2024;():17344-17350 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICRA57147.2024.10610055 · Ref ID: 4665 "Hey, robot. Let's tidy up the kitchen. By the way, I have back pain today". How can a robotic system devise a shared plan with an appropriate task allocation from this abstract goal and agent condition? Classical AI task planning has been explored for this purpose, but it involves a tedious definition of an inflexible planning problem. Large Language Models (LLMs) have shown promising generalisation capabilities in robotics decision-making through knowledge extraction from Natural Language (NL). However, the translation of NL information into constrained robotics domains remains a challenge. In this paper, we use LLMs as translators between NL information and a structured AI task planning problem, targeting human-robot collaborative plans. The LLM generates information that is encoded in the planning problem, including specific subgoals derived from an NL abstract goal, as well as recommendations for subgoal allocation based on NL agent conditions. The framework, PlanCollabNL, is evaluated for a number of goals and agent conditions, and the results show that correct and executable plans are found in most cases. With this framework, we intend to add flexibility and generalisation to HRC plan generation, eliminating the need for a manual and laborious definition of restricted planning problems and agent models. © 2024 IEEE. |
mohammed afaan voted
yuexi voted
#3540
-
Jain 2024
Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering
arXiv 2024;(): 2024 Ref ID: 8455 We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes. Our method utilizes a graph structured representation to aggregate information about a question and its context (i.e., the conversation so far and evidence retrieved to find an answer), while also harnessing the reasoning and text generation capabilities of large language models (LLMs). Graph embeddings are directly injected into the LLM, bypassing the token embedding layers, and learned end-to-end by minimizing cross-entropy. Our model maintains a memory module to track and update past evidence, thus influencing the graph's structure, as the conversation evolves. Experimental results on the ConvMix benchmark(Christmann et al., 2022a) show that graph embeddings enhance the LLM's ability to reason, while the memory module provides robustness against noise and retrieval errors. |
Xinchen voted
mohammed afaan voted
#2468
-
Jamal 2015
Formalizing air traffic control system using agent-based Mobile Petri Nets
2015 International Conference on Information and Communication Technologies (ICICT) 2015;():1-6 2015 DOI: 10.1109/ICICT.2015.7469480 · Ref ID: 6812 Agent-based Mobile Petri Nets (MPNs) are an emerging variant of classical Petri Nets that allows a graphical representation of the system under development. In addition, agent-based MPNs integrate mobile agent technology for modeling concurrency and mobility. The Unified Modeling Language (UML) has become a de facto standard for modeling any real-world system; unlike UML models, however, MPNs are based on mathematical semantics and can be verified for the presence of errors and inconsistencies. This paper demonstrates the strength of agent-based MPNs in modeling and verifying Air Traffic Control (ATC), a complex, highly distributed and safety-critical system. First, an abstract model of the ATC system is introduced by identifying mobile agents such as aircraft and controllers; the abstract ATC model is then transformed into a formal ATC model. The three major operations of takeoff, enroute flight and landing have been formalized using agent-based MPNs. Finally, reachability analysis has been used to verify the formal ATC model. |
mohammed afaan voted
yuexi voted
#440
-
Jamil 2024
Knowledge Graph Enhancement for Improved Natural Language Health Question Answering using Large Language Models
36th International Conference on Scientific and Statistical Database Management (SSDBM) 2024;(): Rennes Univ, Inria Centre, Rennes, FRANCE Assoc Computing Machinery 2024 DOI: 10.1145/3676288.3676289 · Ref ID: 3461 In this paper we present a method for enhancing Question Answering (QA) systems by iteratively improving Knowledge Graphs (KGs) with a focus on maintaining monotonicity in the enhancement process. We introduce a mathematical framework employing functions τ and φ, where τ transforms text T into a KG K, and φ generates an answer from T for a given question. We propose that augmenting K with domain-specific information, denoted as Δ(K), leads to a more accurate approximation of the expected answer, adhering to the principle that each enhancement either maintains or improves answer quality. This concept is formalized as φ⁻¹(φ(T) ∪ Δ(K)) yielding better results than φ⁻¹(φ(T)). The paper elaborates on this process with practical examples, demonstrating how KG enhancements, under the constraints of monotonicity, lead to successive improvements in the QA system. |
Kwesi voted
Davis voted
#292
-
Janatian 2023
From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems
36th Annual International Conference on Legal Knowledge and Information Systems (JURIX) 2023;379():167-176 Maastricht Univ, Maastricht, NETHERLANDS Ios Press 2023 DOI: 10.3233/faia230962 · Ref ID: 3645 Encoding legislative text in a formal representation is an important prerequisite to different tasks in the field of AI & Law. For example, rule-based expert systems focused on legislation can support laypeople in understanding how legislation applies to them and provide them with helpful context and information. However, the process of analyzing legislation and other sources to encode it in the desired formal representation can be time-consuming and represents a bottleneck in the development of such systems. Here, we investigate to what degree large language models (LLMs), such as GPT-4, are able to automatically extract structured representations from legislation. We use LLMs to create pathways from legislation, according to the JusticeBot methodology for legal decision support systems, evaluate the pathways and compare them to manually created pathways. The results are promising, with 60% of generated pathways being rated as equivalent or better than manually created ones in a blind comparison. The approach suggests a promising path to leverage the capabilities of LLMs to ease the costly development of systems based on symbolic approaches that are transparent and explainable. |
mohammed afaan voted
yuexi voted
#1809
-
Ji 2024
Research on Knowledge Injection Method for Large Language Model Oriented to Process Specification Texts
J. Frontier. Comput. Sci. Technol. 2024;18(9):2361-2369 2024 DOI: 10.3778/j.issn.1673-9418.2406067 · Ref ID: 3806 The application of large language models in process specifications is an effective approach to addressing the issue of inaccurate process knowledge queries. At present, the domain model construction methods through domain knowledge graph embedding or fine-tuning with instruction data are not effective. The difficulty lies in the fact that the process knowledge in the process specifications involves relationships between multiple process elements, which is highly complex. The data are sparse because the standards are only used through citation. The high complexity of process knowledge and sparse data limit the model’s ability to learn process domain concepts, the relationships between concepts and attributes, the relationships between concepts, the relationships between multiple concepts, and reference- based knowledge. To address this difficulty, this paper proposes a large language model knowledge injection method for process specification texts. According to the characteristics of process specification data, this paper designs knowledge injection data including auxiliary sentence identification task, concept-chapter generation task, chapter continuation task and chapter-summary generation task. The model is fine-tuned through supervised learning by combining question-answer pair data to inject domain concepts, attributes, relationships between multiple concepts, and reference knowledge into the model. Experimental results show that the model trained with knowledge injection data and question-answer pair data improves ACC (accuracy) by 7.3 percentage points, ROUGE-L by 7.4 percentage points, and BLEU-4 by 6.2 percentage points compared with the model trained only with question-answer pair data, indicating the effectiveness of the proposed knowledge injection method. 
© 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
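The ACC / ROUGE-L / BLEU-4 gains cited above are standard generation metrics; ROUGE-L in particular is an F-measure over the longest common subsequence (LCS) of the reference and generated token sequences. A minimal sketch in plain Python (not the paper's evaluation code):

```python
# ROUGE-L sketch: F-measure over the longest common subsequence of
# reference and candidate token lists.

def lcs_len(a, b):
    # Classic dynamic-programming LCS length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference, candidate, beta=1.2):
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    recall, precision = lcs / len(ref), lcs / len(cand)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)
```

An identical reference and candidate score 1.0; disjoint token sets score 0.0, matching the "percentage point" framing of the reported improvements.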
Ishan voted · Kwesi voted · Final decision: (blank)
#1988
-
Ji 2022
Transferring Knowledge from Structure-aware Self-attention Language Model to Sequence-to-Sequence Semantic Parsing
Proceedings - International Conference on Computational Linguistics, COLING 2022;29():3164-3174 Association for Computational Linguistics (ACL) 2022 Ref ID: 5347 Semantic parsing considers the task of mapping a natural language sentence into a target formal representation, where various sophisticated sequence-to-sequence (seq2seq) models have been applied with promising results. Generally, these target representations follow a syntax formalism that limits permitted forms. However, it is neither easy nor flexible to explicitly integrate this syntax formalism into a neural seq2seq model. In this paper, we present a structure-aware self-attention language model to capture structural information of target representations and propose a knowledge distillation based approach to incorporating the target language model into a seq2seq model, where grammar rules or sketches are not required in the training process. An ablation study shows that the proposed language model can notably improve the performance of the baseline model. The experiments show that our method achieves new state-of-the-art performance among neural approaches on four semantic parsing (ATIS, GEO) and Python code generation (Django, CoNaLa) tasks. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved. |
Mike voted · Srividya voted · Final decision: (blank)
#80
-
Ji 2021
C-CLUE: A Benchmark of Classical Chinese Based on a Crowdsourcing System for Knowledge Graph Construction
6th China Conference on Knowledge Graph and Semantic Computing (CCKS) 2021;1466():295-301 Guangzhou, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2021 DOI: 10.1007/978-981-16-6471-7_24 · Ref ID: 3286 Knowledge Graph Construction (KGC) aims to organize and visualize knowledge, building on the tasks of Named Entity Recognition (NER) and Relation Extraction (RE). However, the difficulty of comprehension, caused by the differences in grammar and semantics between classical and modern Chinese, makes entity and relation annotation time-consuming and labour-intensive in classical Chinese corpora. In this paper, we design a novel crowdsourcing annotation system, which can gather collective intelligence as well as utilize domain knowledge to achieve efficient annotation and obtain fine-grained, high-quality datasets. More specifically, we assess user professionalism via online tests and take it into account in annotation-result integration and reward assignment, which plays a vital role in improving annotation accuracy. Moreover, we evaluate several pre-trained language models, the state-of-the-art methods in Natural Language Processing (NLP), on the benchmark datasets obtained by the system over the NER and RE tasks. Benchmark datasets, implementation details, and evaluation processes are available at https://github.com/jizijing/C-CLUE. The access URL of the crowdsourcing annotation system is: http://152.136.45.252:60002/pages/login.html. |
mohammed afaan voted · yuexi voted · Final decision: (blank)
#1972
-
Ji 2023
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():1827-1843 Association for Computational Linguistics (ACL) 2023 Ref ID: 5046 Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks. However, the practical deployment still faces challenges, notably the issue of "hallucination", where models generate plausible-sounding but unfaithful or nonsensical information. This issue becomes particularly critical in the medical domain due to the uncommon professional concepts and potential social risks involved. This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets. Our investigation centers on the identification and comprehension of common problematic answers, with a specific emphasis on hallucination. To tackle this challenge, we present an interactive self-reflection methodology that incorporates knowledge acquisition and answer generation. Through this feedback process, our approach steadily enhances the factuality, consistency, and entailment of the generated answers. Consequently, we harness the interactivity and multitasking ability of LLMs and produce progressively more precise and accurate answers. Experimental results on both automatic and human evaluation demonstrate the superiority of our approach in hallucination reduction compared to baselines. © 2023 Association for Computational Linguistics. |
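The interactive self-reflection loop summarized above (draft an answer, critique it against acquired knowledge, revise until the critique passes) can be sketched as follows. `generate` and `critique` stand in for LLM calls; the names and prompt wording are illustrative, not the paper's API:

```python
# Self-reflection loop sketch: draft, critique, revise, bounded by a round limit.

def self_reflect(question, knowledge, generate, critique, max_rounds=3):
    # Draft an answer grounded in the acquired background knowledge.
    answer = generate(f"Knowledge: {knowledge}\nQ: {question}\nAnswer using the knowledge.")
    for _ in range(max_rounds):
        feedback = critique(question, knowledge, answer)
        if feedback is None:  # critique found no unfaithful claims; stop revising
            return answer
        # Revise the answer in light of the critique.
        answer = generate(
            f"Knowledge: {knowledge}\nQ: {question}\n"
            f"Previous answer: {answer}\nProblem: {feedback}\nRevise the answer."
        )
    return answer
```

Capping the rounds matters in practice: a critique model that never approves would otherwise loop forever.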
Davis voted · Ishan voted · Final decision: (blank)
#3653
-
Jia 2024
Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph
arXiv 2024;(): 2024 Ref ID: 8314 The proposed research aims to develop an innovative semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG). To address the limitations of traditional scholarly KG construction and utilization methods, which often fail to capture fine-grained details, we propose a novel framework that integrates the Deep Document Model (DDM) for comprehensive document representation and KG-enhanced Query Processing (KGQP) for optimized complex query handling. DDM enables a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve query accuracy and efficiency with LLMs. By combining the ASKG with LLMs, our approach enhances knowledge utilization and natural language understanding capabilities. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments demonstrate that our framework is superior to baseline methods in terms of retrieval accuracy and query efficiency. We showcase the practical application of our framework in academic research scenarios, highlighting its potential to revolutionize scholarly knowledge management and discovery. This work empowers researchers to acquire and utilize knowledge from documents more effectively and provides a foundation for developing precise and reliable interactions with LLMs. |
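The LLM-SPARQL fusion described here can be illustrated schematically: an LLM (stubbed below as `topic_extractor`) maps a natural-language question onto a SPARQL query over the knowledge graph. The prefix and predicate are invented for illustration; ASKG's actual schema and prompting pipeline differ:

```python
# Schematic sketch of question-to-SPARQL generation. In the real system the
# topic_extractor role is played by an LLM call; here it is any callable.

def question_to_sparql(question, topic_extractor):
    topic = topic_extractor(question)  # e.g. "knowledge graphs"
    # Hypothetical predicate URI, purely for illustration.
    return (
        "SELECT ?paper WHERE { "
        f'?paper <http://example.org/askg/mentions> "{topic}" . '
        "}"
    )
```

The returned string would then be executed against the KG endpoint, and the retrieved nodes fed back into the LLM as context.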
Ishan voted · brandon voted · Final decision: (blank)
#566
-
Jia 2022
The Method for Plausibility Evaluation of Knowledge Triple Based on QA
7th China Conference on Knowledge Graph and Semantic Computing (CCKS) 2022;1711():228-235 Qinhuangdao, PEOPLES R CHINA Springer International Publishing Ag 2022 DOI: 10.1007/978-981-19-8300-9_25 · Ref ID: 3376 At present, most methods for the knowledge graph completion (KGC) task rely heavily on external knowledge bases or graph representation learning. However, completing this task without any external prior knowledge remains a major challenge. To this end, we propose a novel framework that converts the plausibility evaluation of knowledge triples into a question-answering (QA) task, following the ideas of KG-BERT and prompt learning. We also test the effect of different question types on the results. We then fine-tune two pre-trained language models, BERT-wwm-ext and ERNIE-Gram, on the generated sequences so that they can complete the QA task. We won 5th place at the CCKS 2022 track 1 rematch stage, which demonstrates the effectiveness of our method. |
Ishan voted · Xinchen voted · Final decision: (blank)
#119
-
Jiang 2023
COMBO: A Complete Benchmark for Open KG Canonicalization
17th Conference of the European-Chapter of the Association-for-Computational-Linguistics (EACL) 2023;():340-357 Dubrovnik, CROATIA Assoc Computational Linguistics-Acl 2023 Ref ID: 3329 An open knowledge graph (KG) consists of (subject, relation, object) triples extracted from millions of raw texts. The subject and object noun phrases and the relation in an open KG have severe redundancy and ambiguity and need to be canonicalized. Existing datasets for open KG canonicalization only provide gold entity-level canonicalization for noun phrases. In this paper, we present COMBO, a Complete Benchmark for Open KG canonicalization. Compared with existing datasets, we additionally provide gold canonicalization for relation phrases, gold ontology-level canonicalization for noun phrases, as well as the source sentences from which triples are extracted. We also propose metrics for evaluating each type of canonicalization. On the COMBO dataset, we empirically compare previously proposed canonicalization methods as well as a few simple baseline methods based on pretrained language models. We find that properly encoding the phrases in a triple using pretrained language models results in better relation canonicalization and ontology-level canonicalization of the noun phrase. We release our dataset, baselines, and evaluation scripts at https://github.com/jeffchy/COMBO/tree/main. |
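Canonicalization itself amounts to clustering redundant surface forms. A toy sketch that groups noun phrases by token-overlap (Jaccard) similarity, a crude stand-in for the PLM-embedding similarity the benchmark actually evaluates:

```python
# Toy noun-phrase canonicalization: greedy clustering by token Jaccard
# similarity against each cluster's first (representative) phrase.

def canonicalize(phrases, threshold=0.5):
    clusters = []  # each cluster is a list of phrases judged coreferent
    for p in phrases:
        toks = set(p.lower().split())
        for cluster in clusters:
            rep = set(cluster[0].lower().split())
            if len(toks & rep) / len(toks | rep) >= threshold:
                cluster.append(p)
                break
        else:  # no cluster was similar enough; start a new one
            clusters.append([p])
    return clusters
```

Replacing the Jaccard test with cosine similarity over PLM phrase embeddings gives the kind of baseline the paper compares.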
Mike voted · Srividya voted · Final decision: (blank)
#3948
-
Jiang 2020
Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream
arXiv 2020;(): 2020 Ref ID: 7389 Manipulation tasks in daily life, such as pouring water, unfold intentionally under specialized manipulation contexts. Being able to process contextual knowledge in these Activities of Daily Living (ADLs) over time can help us understand manipulation intentions, which are essential for an intelligent robot to transition smoothly between various manipulation actions. In this paper, to model the intended concepts of manipulation, we present a vision dataset under a strictly constrained knowledge domain for both robot and human manipulations, where manipulation concepts and relations are stored by an ontology system in a taxonomic manner. Furthermore, we propose a scheme to generate a combination of visual attentions and an evolving knowledge graph filled with commonsense knowledge. Our scheme works with real-world camera streams and fuses an attention-based Vision-Language model with the ontology system. The experimental results demonstrate that the proposed scheme can successfully represent the evolution of an intended object manipulation procedure for both robots and humans. The proposed scheme allows the robot to mimic human-like intentional behaviors by watching real-time videos. We aim to develop this scheme further for real-world robot intelligence in Human-Robot Interaction. |
mohammed afaan voted · yuexi voted · Final decision: (blank)
#3128
-
Jiang 2024
Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models
Database Systems for Advanced Applications: 29th International Conference, DASFAA 2024, Gifu, Japan, July 2–5, 2024, Proceedings, Part IV 2024;():273–290 Gifu, Japan Springer-Verlag 2024 DOI: 10.1007/978-981-97-5562-2_18 · Ref ID: 7249 |
Srividya voted · Ishan voted · Final decision: (blank)
#3735
-
Jiang 2024
Neuron-Level Sequential Editing for Large Language Models
arXiv 2024;(): 2024 Ref ID: 8663 This work explores sequential model editing in large language models (LLMs), a critical task that involves modifying internal knowledge within LLMs continuously through multi-round editing, each round incorporating updates or corrections to adjust the model outputs without the need for costly retraining. Existing model editing methods, especially those that alter model parameters, typically focus on single-round editing and often face significant challenges in sequential model editing, most notably issues of model forgetting and failure. To address these challenges, we introduce a new model editing method, namely Neuron-level Sequential Editing (NSE), tailored for supporting sequential model editing. Specifically, we optimize the target layer's hidden states using the model's original weights to prevent model failure. Furthermore, we iteratively select neurons in multiple layers for editing based on their activation values to mitigate model forgetting. Our empirical experiments demonstrate that NSE significantly outperforms current parameter-modifying model editing methods, marking a substantial advancement in the field of sequential model editing. Our code is released at https://github.com/jianghoucheng/NSE. |
Mike voted · Xinchen voted · Final decision: (blank)
#3716
-
Jiang 2024
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
arXiv 2024;(): 2024 Ref ID: 8462 Adapting general large language models (LLMs) to specialized domains presents great challenges due to varied data distributions. This adaptation typically requires continual pre-training on massive domain-specific corpora to facilitate knowledge memorization, followed by training to apply this knowledge following human instructions and preferences. However, this method may result in inefficient knowledge memorization due to a lack of awareness of knowledge utilization and imposes substantial demands on LLMs to simultaneously learn knowledge utilization and format alignment with limited training samples. To facilitate the domain adaptation of LLM, we revise this process and propose a new domain adaptation framework including domain knowledge learning and general format alignment, called Mix-CPT. Specifically, we first conduct a knowledge mixture continual pre-training that concurrently focuses on knowledge memorization and utilization, allowing for mutual reinforcement. To avoid catastrophic forgetting during the continual pre-training process, we further incorporate a logit swap self-distillation constraint. Subsequently, leveraging the knowledge and capabilities acquired during continual pre-training, we efficiently perform instruction tuning and alignment with a few general training samples to achieve format alignment. Extensive experiments demonstrate that our proposed Mix-CPT framework can simultaneously improve the task-solving capabilities of LLMs on the target and general domains compared to the traditional adaptation methods. |
mohammed afaan voted · yuexi voted · Final decision: (blank)
#1004
-
Jiang 2024
Augmenting NLP Models with Commonsense Knowledge
SpringerBriefs Comp. Sci. 2024;Part F2530():65-89 Springer 2024 DOI: 10.1007/978-981-97-0747-8_5 · Ref ID: 4308 This chapter focuses on augmenting NLP models with commonsense knowledge to enhance their performance in natural language understanding and generation tasks. We begin by discussing the importance of commonsense knowledge in NLP and the challenges faced by NLP models in reasoning with commonsense. We explore different types of commonsense knowledge and reasoning tasks, including multiple-choice tasks, open-ended QA, constrained NLG, and commonsense probing of language models. We then introduce the various techniques for augmenting NLP models with commonsense knowledge. We discuss the use of structured knowledge bases, such as ConceptNet, and the incorporation of graph networks for encoding structured knowledge. We also examine the augmentation of NLP models with un/semi-structured knowledge sources, such as text corpora and the use of dense passage retrieval for open-ended QA. Furthermore, we explore differentiable reasoning methods, such as DrFact, for reasoning with semi-structured knowledge. Finally, we discuss the use of neural knowledge models, such as COMET and LLMs, for incorporating commonsense knowledge. We explore the generation of commonsense knowledge graphs using LLMs and knowledge distillation techniques to create smaller, specialized commonsense models. We also examine the use of large language models for extracting relevant commonsense knowledge for reasoning. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024. |
Srividya voted · Xinchen voted · Final decision: (blank)
#1943
-
Jiang 2023
Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():11161-11180 Association for Computational Linguistics (ACL) 2023 Ref ID: 5136 The mission of open knowledge graph (KG) completion is to draw new findings from known facts. Existing works that augment KG completion require either (1) factual triples to enlarge the graph reasoning space or (2) manually designed prompts to extract knowledge from a pre-trained language model (PLM), exhibiting limited performance and requiring expensive efforts from experts. To this end, we propose TAGREAL that automatically generates quality query prompts and retrieves support information from large text corpora to probe knowledge from PLM for KG completion. The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets. We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods. © 2023 Association for Computational Linguistics. |
Srividya voted · Xinchen voted · Final decision: (blank)
#1229
-
Jiang 2024
Efficient Knowledge Infusion via KG-LLM Alignment
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():2986-2999 Association for Computational Linguistics (ACL) 2024 Ref ID: 4235 To tackle the problem of domain-specific knowledge scarcity within large language models (LLMs), knowledge-graph retrieval augmentation has proven to be an effective and efficient technique for knowledge infusion. However, existing approaches face two primary challenges: knowledge mismatch between publicly available knowledge graphs and the specific domain of the task at hand, and poor information compliance of LLMs with knowledge graphs. In this paper, we leverage a small set of labeled samples and a large-scale corpus to efficiently construct domain-specific knowledge graphs with an LLM, addressing the issue of knowledge mismatch. Additionally, we propose a three-stage KG-LLM alignment strategy to enhance the LLM's capability to utilize information from knowledge graphs. We conduct experiments in a limited-sample setting on two biomedical question-answering datasets, and the results demonstrate that our approach outperforms existing baselines. © 2024 Association for Computational Linguistics. |
Davis voted · Mike voted · Final decision: (blank)
#1290
-
Jiayang 2024
EventGround: Narrative Reasoning by Grounding to Eventuality-centric Knowledge Graphs
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():6622-6642 European Language Resources Association (ELRA) 2024 Ref ID: 4568 Narrative reasoning relies on the understanding of eventualities in story contexts, which requires a wealth of background world knowledge. To help machines leverage such knowledge, existing solutions can be categorized into two groups. Some focus on implicitly modeling eventuality knowledge by pretraining language models (LMs) with eventuality-aware objectives. However, this approach breaks down knowledge structures and lacks interpretability. Others explicitly collect world knowledge of eventualities into structured eventuality-centric knowledge graphs (KGs). However, existing research on leveraging these knowledge sources for free-texts is limited. In this work, we propose an initial comprehensive framework called EventGround, which aims to tackle the problem of grounding free-texts to eventuality-centric KGs for contextualized narrative reasoning. We identify two critical problems in this direction: the event representation and sparsity problems. We provide simple yet effective parsing and partial information extraction methods to tackle these problems. Experimental results demonstrate that our approach consistently outperforms baseline models when combined with graph neural network (GNN) or large language model (LLM) based graph reasoning models. Our framework, incorporating grounded knowledge, achieves state-of-the-art performance while providing interpretable evidence. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Xinchen voted · Srividya voted · Final decision: (blank)
#2454
-
Jilani 2008
Formal Representations of the Data Flow Diagram: A Survey
2008 Advanced Software Engineering and Its Applications 2008;():153-158 2008 DOI: 10.1109/ASEA.2008.34 · Ref ID: 6076 Structured analysis and design methodology has now been replaced by object-oriented analysis and design techniques for software development. A major design artifact in the structured approach is the data flow diagram (DFD). The DFD is very important for the modernization of old legacy systems and is also very useful in requirements elicitation. However, the DFD lacks formalism; representing it formally removes ambiguity and inconsistency. A formal representation of the DFD and its formal semantics help in better understanding requirements and design. In this paper, we present a survey of techniques that formally represent, or give formal semantics to, the data flow diagram. We analyze the formal representation techniques using analysis parameters and, on the basis of the identified parameters, present an analysis table describing the strengths and weaknesses of the representation techniques. |
mohammed afaan voted · yuexi voted · Final decision: (blank)
#1582
-
Jin 2024
Learning to Verify and Assure Cyber-Physical Systems
AIAA SciTech Forum and Exposition, 2024 2024;(): American Institute of Aeronautics and Astronautics Inc, AIAA 2024 DOI: 10.2514/6.2024-1853 · Ref ID: 4527 Certification of aircraft systems is a complex task that is difficult to automate, requiring significant subjective decision making. Established certification standards such as CFR-25 and MIL-HDBK-516 require considerable subjective analysis to transform the certification requirements into meaningful, actionable requirements. This prevents the automation of any verification, assurance and certification task. While established methods rely on automating assurance case generation, several key tasks still require human input, preventing the co-creation of designs, verification scenarios, evidence, and assurance cases. Building on the recent success of large language models, we develop a framework that enables the automated verification and assurance of cyber-physical systems. Our framework consists of two parts: (i) an automated pipeline that permits the automatic extraction of information from system artifacts such that they can be semantically linked in a graphical representation; and (ii) an automated pipeline that enables the synthesis of verification scenarios to co-generate evidence along with the assurance case of the cyber-physical system. We demonstrate our framework on a concrete example of a landing gear sub-system of the aircraft and highlight the benefits that can be realized through the automation of the bottlenecks in the tasks of verification, assurance, and certification. Using semantically linked representations of the knowledge, we enable complex reasoning over the knowledge contained in system artifacts, providing meaningful feedback to designers and certifiers on means to improve the overall system.
Our investigations on the landing-gear use case demonstrate the feasibility of using large language models to support systems engineering tasks for cyber-physical systems and of using knowledge graphs in the construction and assessment of the cyber-physical system’s assurance. © 2024 by the American Institute of Aeronautics and Astronautics, Inc. |
mohammed afaan voted · yuexi voted · Final decision: (blank)
#1535
-
Jin 2022
A Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding
EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Demonstrations Session 2022;():1-11 Association for Computational Linguistics (ACL) 2022 Ref ID: 5489 As the first step of modern natural language processing, text representation encodes discrete texts as continuous embeddings. Pre-trained language models (PLMs) have demonstrated strong ability in text representation and significantly promoted the development of natural language understanding (NLU). However, existing PLMs represent a text solely by its context, which is not enough to support knowledge-intensive NLU tasks. Knowledge is power, and fusing external knowledge explicitly into PLMs can provide knowledgeable text representations. Previous knowledge-enhanced methods differ in many aspects, making it difficult to reproduce them, implement new methods, and transfer between different methods; it is highly desirable to have a unified paradigm that encompasses all kinds of methods in one framework. In this paper, we propose CogKTR, a knowledge-enhanced text representation toolkit for natural language understanding. According to our proposed Unified Knowledge-Enhanced Paradigm (UniKEP), CogKTR consists of four key stages, including knowledge acquisition, knowledge representation, knowledge injection, and knowledge application. CogKTR currently supports easy-to-use knowledge acquisition interfaces, multi-source knowledge embeddings, diverse knowledge-enhanced models, and various knowledge-intensive NLU tasks. Our unified, knowledgeable and modular toolkit is publicly available at GitHub, with an online system and a short instruction video. © 2022 Association for Computational Linguistics. |
Mike voted · Srividya voted · Final decision: (blank)
#3500
-
Jinensibieke 2024
How Good are LLMs at Relation Extraction under Low-Resource Scenario? Comprehensive Evaluation
arXiv 2024;(): 2024 Ref ID: 8393 Relation Extraction (RE) serves as a crucial technology for transforming unstructured text into structured information, especially within the framework of Knowledge Graph development. Its importance is emphasized by its essential role in various downstream tasks. Besides the conventional RE methods based on neural networks and pre-trained language models, large language models (LLMs) are also utilized in RE research. However, on low-resource languages (LRLs), both conventional RE methods and LLM-based methods perform poorly due to data scarcity. To this end, this paper constructs low-resource relation extraction datasets in 10 LRLs across three regions (Central Asia, Southeast Asia and the Middle East). The corpora are constructed by translating the original publicly available English RE datasets (NYT10, FewRel and CrossRE) using effective multilingual machine translation. Then, we use language perplexity (PPL) to filter out low-quality data from the translated datasets. Finally, we conduct an empirical study and validate the performance of several open-source LLMs on these generated LRL RE datasets. |
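The PPL filter mentioned above keeps only translations a language model finds fluent. A minimal sketch, assuming per-token log-probabilities are already available from some LM (the helper names and threshold are illustrative, not from the paper):

```python
import math

# Perplexity-based quality filter: PPL = exp(-mean log p(token));
# lower perplexity means the LM finds the sentence more fluent.

def perplexity(token_logprobs):
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def filter_by_ppl(sentences_with_logprobs, max_ppl=100.0):
    # Keep (sentence, logprobs) pairs whose perplexity is at most max_ppl.
    return [s for s, lps in sentences_with_logprobs if perplexity(lps) <= max_ppl]
```

For example, a sentence whose tokens each have probability 0.5 has perplexity exactly 2 and survives the filter, while a garbled translation with near-zero token probabilities is dropped.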
Srividya voted · Ishan voted · Final decision: (blank)
#615
-
Jing 2022
A Novel Named Entity Recognition Algorithm for Hot Strip Rolling Based on BERT-Imseq2seq-CRF Model
Named entity recognition is not only the first step of text information extraction, but also a key process in constructing domain knowledge graphs. In view of the large amount of text data, the complex process flow and the urgent application needs of the hot strip rolling process, a novel named entity recognition algorithm based on the BERT-Imseq2seq-CRF model is proposed in this paper. Firstly, the algorithm uses the BERT pre-trained language model to mine the dependencies in the domain text and obtain the corresponding representation vectors. Then, the representation vector is sent to the encoder layer and its output is fed into the decoder at the same time, whereas the original model considered only the semantic vector. The Teacher-Forcing mechanism is integrated into the decoder layer to randomly correct the labeling results, avoiding error accumulation and guaranteeing the sequence recognition effect. Finally, the validity of the labeling results is checked against the conditional random field constraints, improving the overall labeling quality of the algorithm. The experimental results show that this model can efficiently and accurately predict the entity labels of hot strip rolling text, and its performance indices are better than those of other models, with the F1-Score reaching 91.47%. This model further provides technical support for information extraction and domain knowledge graph construction for hot strip rolling. |
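The F1-Score of 91.47% reported here is, as is standard for NER, an entity-level F-measure: a predicted entity counts as correct only if its span and type both match a gold annotation exactly. A minimal sketch (not the paper's evaluation code; entities are (start, end, type) tuples):

```python
# Entity-level precision/recall/F1 for NER: exact span-and-type matching.

def ner_f1(gold_entities, pred_entities):
    gold, pred = set(gold_entities), set(pred_entities)
    tp = len(gold & pred)  # entities predicted exactly right
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With two gold entities and one exactly matching prediction, precision is 1.0, recall 0.5, and F1 is 2/3.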
Mike voted · Srividya voted · Final decision: (blank)
#1772
-
Jing 2023
Prompt-assisted Relation Fusion in Knowledge Graph Acquisition
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics 2023;():2960-2965 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/SMC53992.2023.10394554 · Ref ID: 4910 This paper investigated how prompt-based learning techniques can assist with relation fusion in Knowledge Graph (KG) acquisition. We created an unsupervised framework to generate a KG from a real-world dataset. The framework incorporates prompting with knowledge entity metadata and generating predicate embeddings with the pretrained Masked Language Model (MLM) RoBERTa. Predicate embeddings were clustered to form conceptual groups, and feature tokens were used to derive relation labels. In addition, we conducted a comparative study on the effects of different prompting templates. The resulting relation labels were evaluated by human annotators, which indicated that prompt-based learning, if applied appropriately, can help with deducing conceptualized relations. Our framework proposes a way to improve the quality of KGs acquired using traditional Relation Extraction (RE). It can also assist human experts effectively in semi-automated knowledge acquisition. © 2023 IEEE. |
mohammed afaan voted · Ishan voted · Final decision: (blank)
#138
-
Jovanovic 2023
Connecting AI: Merging Large Language Models and Knowledge Graph
Combining the generative abilities of large language models with the logical and factual coherence of knowledge graphs using a connected artificial intelligence architecture minimizes each system's shortcomings and amplifies their strengths across many real-world domains. |
Srividya voted · Mike voted · Final decision: (blank)
#1437
-
Ju 2024
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():8987-9001 Association for Computational Linguistics (ACL) 2024 Ref ID: 4380 Recent work has showcased the powerful capability of large language models (LLMs) in recalling knowledge and reasoning. However, the reliability of LLMs in combining these two capabilities into reasoning through multi-hop facts has not been widely explored. This paper systematically investigates the possibilities for LLMs to utilize shortcuts based on direct connections between the initial and terminal entities of multi-hop knowledge. We first explore the existence of factual shortcuts through Knowledge Neurons, revealing that: (i) the strength of factual shortcuts is highly correlated with the frequency of co-occurrence of initial and terminal entities in the pre-training corpora; (ii) few-shot prompting leverages more shortcuts in answering multi-hop questions than chain-of-thought prompting. Then, we analyze the risks posed by factual shortcuts from the perspective of multi-hop knowledge editing. Analysis shows that approximately 20% of the failures are attributed to shortcuts, and the initial and terminal entities in these failure instances usually have higher co-occurrences in the pre-training corpus. Finally, we propose erasing shortcut neurons to mitigate the associated risks and find that this approach significantly reduces failures in multi-hop knowledge editing caused by shortcuts. Code is publicly available at https://github.com/Jometeorie/MultiHopShortcuts. © 2024 Association for Computational Linguistics. |
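Finding (i) above rests on a simple corpus statistic: how often the initial and terminal entities of a multi-hop fact co-occur in the same pre-training document. A toy sketch on a small corpus (entity mentions are matched by plain substring here, purely for illustration):

```python
from collections import Counter
from itertools import combinations

# Count document-level co-occurrences of entity pairs, a proxy for the
# strength of a direct "shortcut" between initial and terminal entities.

def cooccurrence_counts(corpus_docs, entities):
    counts = Counter()
    for doc in corpus_docs:
        present = [e for e in entities if e in doc]  # naive substring match
        for a, b in combinations(sorted(set(present)), 2):
            counts[(a, b)] += 1  # pair keys are sorted for a canonical order
    return counts
```

In the paper's analysis, pairs with high counts like this are the ones where the model tends to answer a multi-hop question via the direct shortcut rather than the intermediate hop.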
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#212
-
Kahlawi 2024
Enhancing Administrative Source Registers for the Development of a Robust Large Language Model: A Novel Methodological Approach
Int. J. Adv. Comput. Sci. Appl. 2024;15(7):9-17 2024 Ref ID: 3073 Accurate statistical information is critical for understanding, describing, and managing socio-economic systems. While data availability has increased, often it does not meet the quality requirements for effective governance. Administrative registers are crucial for statistical information production, but their potential is hampered by quality issues stemming from administrative inconsistencies. This paper explores the integration of semantic technologies, including ontologies and knowledge graphs, with administrative databases to improve data quality. We discuss the development of large language models (LLMs) that enable a robust, queryable framework, facilitating the integration of disparate data sources. This approach ensures high-quality administrative data, essential for statistical reuse and the development of comprehensive, dynamic knowledge graphs and LLMs tailored for administrative applications. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3472
-
Kalifa 2024
GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning
arXiv 2024;(): 2024 Ref ID: 8500 Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in utilizing machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on the amino acid sequence of proteins and lack factual knowledge about proteins and their interactions, thus limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows for the integration of information at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective learning process through graph-based learning. By doing so, we can capture complex relationships and dependencies between proteins and their functional annotations, resulting in more robust and contextually enriched protein representations. Unlike previous fusion methods, GOProteinGNN uniquely learns the entire protein knowledge graph during training, which allows it to capture broader relational nuances and dependencies beyond mere triplets as done in previous work. We perform a comprehensive evaluation on several downstream tasks demonstrating that GOProteinGNN consistently outperforms previous methods, showcasing its effectiveness and establishing it as a state-of-the-art solution for protein representation learning. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#479
-
Kalo 2020
KnowlyBERT - Hybrid Query Answering over Language Models and Knowledge Graphs
19th International Semantic Web Conference (ISWC) 2020;12506():294-310 Athens, GREECE Springer International Publishing Ag 2020 DOI: 10.1007/978-3-030-62419-4_17 · Ref ID: 2919 Providing a plethora of entity-centric information, Knowledge Graphs have become a vital building block for a variety of intelligent applications. Indeed, modern knowledge graphs like Wikidata already capture several billions of RDF triples, yet they still lack a good coverage for most relations. On the other hand, recent developments in NLP research show that neural language models can easily be queried for relational knowledge without requiring massive amounts of training data. In this work, we leverage this idea by creating a hybrid query answering system on top of knowledge graphs in combination with the masked language model BERT to complete query results. We thus incorporate valuable structural and semantic information from knowledge graphs with textual knowledge from language models to achieve high precision query results. Standard techniques for dealing with incomplete knowledge graphs are either (1) relation extraction which requires massive amounts of training data or (2) knowledge graph embeddings which have problems to succeed beyond simple baseline datasets. Our hybrid system KnowlyBERT requires only small amounts of training data, while outperforming state-of-the-art techniques by boosting their precision by over 30% in our large Wikidata experiment. |
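The hybrid idea, answering a query from the knowledge graph first and letting a language model fill the gaps, can be sketched abstractly. The toy KG, the `lm_candidates` stub, and the 0.5 confidence threshold are illustrative assumptions, not KnowlyBERT's actual components (which query BERT as a masked language model):

```python
# Toy KG: (subject, relation) -> set of known object entities.
KG = {("Germany", "capital"): {"Berlin"}}

def lm_candidates(subject, relation):
    """Stub for a masked-LM query such as 'The capital of France is [MASK].';
    a real system would score BERT's fill-in predictions."""
    scored = {
        ("France", "capital"): {"Paris": 0.93, "Lyon": 0.04},
        ("Germany", "capital"): {"Berlin": 0.88, "Munich": 0.07},
    }
    return scored.get((subject, relation), {})

def answer(subject, relation, threshold=0.5):
    """Return KG objects when available, else confident LM predictions."""
    if (subject, relation) in KG:
        return KG[(subject, relation)]
    return {e for e, p in lm_candidates(subject, relation).items()
            if p >= threshold}

print(answer("Germany", "capital"))  # answered from the KG
print(answer("France", "capital"))   # completed by the (stubbed) LM
```

The design point is that the structured source stays authoritative and the language model is only consulted where the graph is incomplete.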
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1616
-
Kalo 2023
LM-KBC 2023: 2nd Challenge on Knowledge Base Construction from Pre-trained Language Models
CEUR Workshop Proceedings 2023;3577(): CEUR-WS 2023 Ref ID: 5031 Large language models (LLMs) like chatGPT [1] have advanced a range of semantic tasks and are being ubiquitously used for knowledge extraction. Although several works have explored this ability by crafting prompts with in-context or instruction learning, the viability of complete and precise knowledge base construction from LMs is still in its nascent form. In the 2nd edition of this challenge, we invited participants to extract disambiguated knowledge triples from LMs for a given set of subjects and relations. In crucial difference to existing probing benchmarks like LAMA [2], we made no simplifying assumptions on relation cardinalities, i.e., a subject-entity can stand in relation with zero, one, or many object-entities. Furthermore, submissions needed to go beyond just ranking predicted surface strings, and materialize disambiguated entities in the output, which were evaluated using established KB metrics of precision, recall, and F1-score. The challenge had two tracks: (1) a small model track, where models with < 1 billion parameters could be probed, and (2) an open track, where participants could use any LM of their choice. We received seven submissions, two for track 1 and five for track 2. We present the contributions and insights of the submitted peer-reviewed submissions and lay out the possible paths for future work. All the details related to the challenge can be found on our website at https://lm-kbc.github.io/challenge2023/. © 2023 CEUR-WS. All rights reserved. |
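The KB metrics used for evaluation can be sketched as set comparisons per subject-relation pair; handling the "zero objects" case is what distinguishes them from simple ranking metrics. The convention of scoring a correct empty prediction as 1.0 is our assumption of one common scoring choice, not necessarily the challenge's exact rule:

```python
def kb_scores(predicted, gold):
    """Set-based precision/recall/F1 for one subject-relation pair,
    where either the predicted or the gold object set may be empty."""
    pred, gold = set(predicted), set(gold)
    if not pred and not gold:        # correct abstention: full credit
        return 1.0, 1.0, 1.0
    tp = len(pred & gold)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def macro_f1(rows):
    """Average F1 over (predicted, gold) object-set pairs."""
    return sum(kb_scores(p, g)[2] for p, g in rows) / len(rows)

rows = [
    ({"Q64"}, {"Q64"}),          # exact match
    ({"Q64", "Q90"}, {"Q64"}),   # one spurious object
    (set(), set()),              # correct empty answer
]
print(round(macro_f1(rows), 3))
```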
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3723
-
Kalyanpur 2024
Multi-step Inference over Unstructured Data
arXiv 2024;(): 2024 Ref ID: 8423 The advent of Large Language Models (LLMs) and Generative AI has revolutionized natural language applications across various domains. However, high-stakes decision-making tasks in fields such as medical, legal and finance require a level of precision, comprehensiveness, and logical consistency that pure LLM or Retrieval-Augmented-Generation (RAG) approaches often fail to deliver. At Elemental Cognition (EC), we have developed a neuro-symbolic AI platform to tackle these problems. The platform integrates fine-tuned LLMs for knowledge extraction and alignment with a robust symbolic reasoning engine for logical inference, planning and interactive constraint solving. We describe Cora, a Collaborative Research Assistant built on this platform, that is designed to perform complex research and discovery tasks in high-stakes domains. This paper discusses the multi-step inference challenges inherent in such domains, critiques the limitations of existing LLM-based methods, and demonstrates how Cora's neuro-symbolic approach effectively addresses these issues. We provide an overview of the system architecture, key algorithms for knowledge extraction and formal reasoning, and present preliminary evaluation results that highlight Cora's superior performance compared to well-known LLM and RAG baselines. |
Ishan
voted
brandon
voted
Final decision
What was the agreed final decision?
#567
-
Kaneda 2023
A Method to Construct a Masked Knowledge Graph Model using Transformer for Knowledge Graph Reasoning
17th IEEE International Conference on Semantic Computing (ICSC) 2023;():298-299 Laguna Hills, CA Ieee Computer Soc 2023 DOI: 10.1109/icsc56153.2023.00061 · Ref ID: 2966 Most of the previous methods using machine learning for this challenge generate a new knowledge graph from the original one, and some information is lost in the process of creating a new knowledge graph. Therefore, we proposed a new model to estimate the criminal without changing the original knowledge graph. The proposed model uses a Transformer and allows the estimation of unknown criminals in nonexistent scenes by learning similar to Masked Language Modeling in BERT. This model, which uses the original knowledge graph, is expected to infer information about the crime scene at the same time as predicting the criminal. We confirmed by experiments that the model had gained the ability to estimate the hidden story parts by considering the surrounding stories. |
Mike
voted
Davis
voted
Final decision
What was the agreed final decision?
#467
-
Kang 2024
Knowledge-aware adaptive graph network for commonsense question answering
Commonsense Question Answering (CQA) aims to select the correct answers to common knowledge questions. Most existing approaches focus on integrating external knowledge graph (KG) representations with question context representations to facilitate reasoning. However, the approaches cannot effectively select the correct answer due to (i) the incomplete reasoning chains when using knowledge graphs as external knowledge, and (ii) the insufficient understanding of semantic information of the question during the reasoning process. Here we propose a novel model, KA-AGN. First, we utilize a joint representation of dependency parse trees and language models to describe QA pairs. Next, we introduce question semantic information as nodes into a knowledge subgraph and compute the correlations between nodes using adaptive graph networks. Finally, bidirectional attention and graph pruning are employed to update the question representation and the knowledge subgraph representation. To evaluate the performance of our method, we conducted experiments on two widely used benchmark datasets: CommonsenseQA and OpenBookQA. The ablation experiment results demonstrate the effectiveness of the adaptive graph network in enhancing reasoning chains, while showing the ability of the joint representation of dependency parse trees and language models to correctly understand question semantics. Our code is publicly available at https://github.com/agfsghfdhg/KAAGN-main. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3610
-
Kang 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
arXiv 2023;(): 2023 Ref ID: 7736 Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to their high computational requirements and concerns on data privacy. Previous studies have focused on building task-specific small Language Models (LMs) by fine-tuning them with labeled data or distilling LLMs. However, these approaches are ill-suited for knowledge-intensive reasoning tasks due to the limited capacity of small LMs in memorizing the knowledge required. Motivated by our theoretical analysis on memorization, we propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales obtained from LLMs with augmented knowledge retrieved from an external knowledge base. Moreover, we further propose a neural reranker to obtain documents relevant to rationale generation. We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets, namely MedQA-USMLE, StrategyQA, and OpenbookQA. Notably, our method makes the 250M T5 models achieve superior performance against the fine-tuned 3B models, having 12 times larger parameters, on both MedQA-USMLE and StrategyQA benchmarks. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3236
-
Kang 2024
Bridging Law and Data: Augmenting Reasoning via a Semi-Structured Dataset with IRAC methodology
arXiv 2024;(): 2024 Ref ID: 8402 The effectiveness of Large Language Models (LLMs) in legal reasoning is often limited due to the unique legal terminologies and the necessity for highly specialized knowledge. These limitations highlight the need for high-quality data tailored for complex legal reasoning tasks. This paper introduces LEGALSEMI, a benchmark specifically curated for legal scenario analysis. LEGALSEMI comprises 54 legal scenarios, each rigorously annotated by legal experts, based on the comprehensive IRAC (Issue, Rule, Application, Conclusion) framework. In addition, LEGALSEMI is accompanied by a structured knowledge graph (SKG). A series of experiments were conducted to assess the usefulness of LEGALSEMI for IRAC analysis. The experimental results demonstrate the effectiveness of incorporating the SKG for issue identification, rule retrieval, application and conclusion generation using four different LLMs. LEGALSEMI will be publicly available upon acceptance of this paper. |
Kwesi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3917
-
Kannan 2024
A Timeline and Analysis for Representation Plasticity in Large Language Models
arXiv 2024;(): 2024 Ref ID: 8676 The ability to steer AI behavior is crucial to preventing its long term dangerous and catastrophic potential. Representation Engineering (RepE) has emerged as a novel, powerful method to steer internal model behaviors, such as "honesty", at a top-down level. Understanding the steering of representations should thus be placed at the forefront of alignment initiatives. Unfortunately, current efforts to understand plasticity at this level are highly neglected. This paper aims to bridge the knowledge gap and understand how LLM representation stability, specifically for the concept of "honesty", and model plasticity evolve by applying steering vectors extracted at different fine-tuning stages, revealing differing magnitudes of shifts in model behavior. The findings are pivotal, showing that while early steering exhibits high plasticity, later stages have a surprisingly responsive critical window. This pattern is observed across different model architectures, signaling that there is a general pattern of model plasticity that can be used for effective intervention. These insights greatly contribute to the field of AI transparency, addressing a pressing lack of efficiency limiting our ability to effectively steer model behavior. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1349
-
Karacapilidis 2024
Generative AI and Public Deliberation: A Framework for LLM-augmented Digital Democracy
CEUR Workshop Proceedings 2024;3737(): CEUR-WS 2024 Ref ID: 4461 Aiming to augment the effectiveness and scalability of existing digital deliberation platforms, while also facilitating evidence-based collective decision making and increasing citizen participation and trust, this article (i) reviews state-of-the-art applications of LLMs in diverse public deliberation issues; (ii) proposes a novel digital deliberation framework that meaningfully incorporates Knowledge Graphs and neuro-symbolic reasoning approaches to improve the factual accuracy and reasoning capabilities of LLMs, and (iii) demonstrates the potential of the proposed solution through two key deliberation tasks, namely fact checking and argument building. The article provides insights about how modern AI technology should be used to address the equity perspective, helping citizens to construct robust and informed arguments, refine their prose, and contribute comprehensible feedback; and aiding policy makers in obtaining a deep understanding of the evolution and outcome of a deliberation. © 2024 Copyright for this paper by its authors. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#23
-
Kardos 2023
Are These Descriptions Referring to the Same Entity or Just to Similar Ones?
19th International Conference on Artificial Intelligence Applications and Innovations (AIAI) 2023;676():387-398 Leon, SPAIN Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-34107-6_31 · Ref ID: 3151 The Knowledge Graph matching task is to identify nodes in the two graphs that refer to the same concept. In this paper, we focus on the analysis of textual descriptions of the concepts. We employ neural language models, as they score well on text content similarity. On the other hand, we show that textual similarity of entity descriptions does not mean that they refer to the exact same entity. Our text-based multi-step system was among the top participants at the Knowledge Graph matching track of the Ontology Alignment Evaluation Initiative. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#570
-
Kasner 2023
Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models
17th Conference of the European-Chapter of the Association-for-Computational-Linguistics (EACL) 2023;():2398-2415 Dubrovnik, CROATIA Assoc Computational Linguistics-Acl 2023 Ref ID: 3161 Pretrained language models (PLMs) for data-to-text (D2T) generation can use human-readable data labels such as column headings, keys, or relation names to generalize to out-of-domain examples. However, the models are well known for producing semantically inaccurate outputs if these labels are ambiguous or incomplete, which is often the case in D2T datasets. In this paper, we expose this issue on the task of describing a relation between two entities. For our experiments, we collect a novel dataset for verbalizing a diverse set of 1,522 unique relations from three large-scale knowledge graphs (Wikidata, DBPedia, YAGO). We find that although PLMs for D2T generation expectedly fail on unclear cases, models trained with a large variety of relation labels are surprisingly robust in verbalizing novel, unseen relations. We argue that using data with a diverse set of clear and meaningful labels is key to training D2T generation systems capable of generalizing to novel domains. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3268
-
Kau 2024
Combining Knowledge Graphs and Large Language Models
arXiv 2024;(): 2024 Ref ID: 8452 In recent years, Natural Language Processing (NLP) has played a significant role in various Artificial Intelligence (AI) applications such as chatbots, text generation, and language translation. The emergence of large language models (LLMs) has greatly improved the performance of these applications, showing astonishing results in language understanding and generation. However, they still show some disadvantages, such as hallucinations and lack of domain-specific knowledge, that affect their performance in real-world tasks. These issues can be effectively mitigated by incorporating knowledge graphs (KGs), which organise information in structured formats that capture relationships between entities in a versatile and interpretable fashion. Likewise, the construction and validation of KGs present challenges that LLMs can help resolve. The complementary relationship between LLMs and KGs has led to a trend that combines these technologies to achieve trustworthy results. This work collected 28 papers outlining methods for KG-powered LLMs, LLM-based KGs, and LLM-KG hybrid approaches. We systematically analysed and compared these approaches to provide a comprehensive overview highlighting key trends, innovative techniques, and common challenges. This synthesis will benefit researchers new to the field and those seeking to deepen their understanding of how KGs and LLMs can be effectively combined to enhance AI applications capabilities. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3751
-
Ke 2024
Retrieval Augmented Generation for 10 Large Language Models and its Generalizability in Assessing Medical Fitness
arXiv 2024;(): 2024 Ref ID: 8688 Large Language Models (LLMs) show potential for medical applications but often lack specialized clinical knowledge. Retrieval Augmented Generation (RAG) allows customization with domain-specific information, making it suitable for healthcare. This study evaluates the accuracy, consistency, and safety of RAG models in determining fitness for surgery and providing preoperative instructions. We developed LLM-RAG models using 35 local and 23 international preoperative guidelines and tested them against human-generated responses. A total of 3,682 responses were evaluated. Clinical documents were processed using Llamaindex, and 10 LLMs, including GPT3.5, GPT4, and Claude-3, were assessed. Fourteen clinical scenarios were analyzed, focusing on seven aspects of preoperative instructions. Established guidelines and expert judgment were used to determine correct responses, with human-generated answers serving as comparisons. The LLM-RAG models generated responses within 20 seconds, significantly faster than clinicians (10 minutes). The GPT4 LLM-RAG model achieved the highest accuracy (96.4% vs. 86.6%, p=0.016), with no hallucinations and producing correct instructions comparable to clinicians. Results were consistent across both local and international guidelines. This study demonstrates the potential of LLM-RAG models for preoperative healthcare tasks, highlighting their efficiency, scalability, and reliability. |
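The retrieval step of such an LLM-RAG pipeline can be sketched with a toy lexical scorer. The guideline snippets, the query, and the Jaccard-overlap scoring below are illustrative stand-ins for the Llamaindex-based document processing the study actually used:

```python
import re

def tokens(text):
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query, doc):
    """Jaccard overlap between token sets - a crude stand-in for the
    embedding similarity a real retriever would use."""
    q, d = tokens(query), tokens(doc)
    return len(q & d) / len(q | d)

def retrieve(query, docs, k=1):
    """Return the k guideline snippets most similar to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

# Hypothetical preoperative-guideline snippets, not from the study.
guidelines = [
    "Stop metformin 48 hours before surgery with contrast imaging.",
    "Continue beta blockers on the morning of surgery.",
    "Fast from solid food for six hours before anaesthesia.",
]
top = retrieve("fasting from food before anaesthesia", guidelines, k=1)
print(top[0])
```

The retrieved snippet would then be placed into the LLM prompt so the model answers from the guideline text rather than from its parametric memory.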
yuexi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2859
-
Keber 2024
A Review on Neuro-symbolic AI Improvements to Natural Language Processing
2024 47th MIPRO ICT and Electronics Convention (MIPRO) 2024;():66-72 2024 DOI: 10.1109/MIPRO60963.2024.10569741 · Ref ID: 6121 Symbolic artificial intelligence (AI) reflects the domain knowledge of experts and adheres to the logic of the subject area, rules, or any relations between entities. Connectionist (neuro) approaches based on artificial neural networks are excellent for extracting abstract features, contextualizing, and embedding interactions between features. When connectionist and symbolic approaches are properly aligned in a model, they benefit from complementary strengths; the combination is referred to as a hybrid or neuro-symbolic artificial intelligence (NSAI) model. The advantages that NSAI brings to the field of natural language processing (NLP) have received little attention from researchers in recent years. Therefore, in this review, we focus on the impact of neuro-symbolic approaches for NLP tasks, i.e. text classification, information extraction, machine translation, and language understanding. Relevant research articles from Scopus, Web of Science, and Google Scholar were carefully examined using appropriate keywords in the period from 2019 to 2024. The review aims to show the types of NSAI systems, identify the motivation for using NSAI, evaluate the use of additional annotations for content description, and briefly describe how the neuro-symbolic connection improves the methodology and enables trustworthy and explainable AI systems in current NLP research. The review also highlights areas of application and improvements achieved by NSAI approaches in benchmarks. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3670
-
Khan 2024
LLM+KG@VLDB'24 Workshop Summary
arXiv 2024;(): 2024 Ref ID: 8651 The unification of large language models (LLMs) and knowledge graphs (KGs) has emerged as a hot topic. At the LLM+KG'24 workshop, held in conjunction with VLDB 2024 in Guangzhou, China, one of the key themes explored was important data management challenges and opportunities due to the effective interaction between LLMs and KGs. This report outlines the major directions and approaches presented by various speakers during the LLM+KG'24 workshop. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#2586
-
Kharitonov 2022
Intelligent Technologies for Projective Thinking and Research Management in the Knowledge Representation System
2022 International Conference on Quality Management, Transport and Information Security, Information Technologies (IT&QM&IS) 2022;():292-295 2022 DOI: 10.1109/ITQMIS56172.2022.9976719 · Ref ID: 6044 It is proposed to address existing methodological issues in the educational process by developing intelligent technologies and knowledge representation systems that improve the efficiency of higher education institutions. For this purpose, a relational database structure is proposed that stores information about defended dissertations as a set of attributes (heuristics) representing the mandatory qualification attributes of theses. An inference algorithm is proposed to process this information; it generates queries based on the applicant's preferences and returns a ranked set of choices. These technologies will allow applicants to quickly become familiar with known scientific results and serve as a starting point for new research. The demand for co-researcher practice in solving the problem of updating the projective thinking methodology and managing the scientific research process is justified. The article also draws attention to existing parallels between the concepts of the technical and human sciences in the framework of their convergence. The concepts of being (economic good and economic utility) and the concepts of consciousness (humanitarian economic good and humanitarian economic utility) are used to form projective thinking; they form direct and inverse correspondences between technology and humanitarian practice in the techno-humanitarian mathematical space. It is proposed to place information extracted from dissertation abstracts, expressed in a context-free formal grammar, in this space.
The principle of data manipulation based on formal languages with context-free grammar makes it possible to create new subject-area structures in terms of applicants' preferences. The success of applicants' work is believed to depend directly on their cognitive training, which must be practiced psychologically; this practice is based on deepening the objectivity and adequacy of the information obtained through heuristic methods, and it requires heightened attention and developed intelligence. The paper shows that applicants' use of heuristic methods to find new research directions leads to several promising results, which can be treated as potential options for future research and contribute to better retention of higher education professionals. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3363
-
Khlaut 2024
Efficient Medical Question Answering with Knowledge-Augmented Question Generation
arXiv 2024;(): 2024 Ref ID: 8312 In the expanding field of language model applications, medical knowledge representation remains a significant challenge due to the specialized nature of the domain. Large language models, such as GPT-4, obtain reasonable scores on medical question answering tasks, but smaller models are far behind. In this work, we introduce a method to improve the proficiency of a small language model in the medical domain by employing a two-fold approach. We first fine-tune the model on a corpus of medical textbooks. Then, we use GPT-4 to generate questions similar to the downstream task, prompted with textbook knowledge, and use them to fine-tune the model. Additionally, we introduce ECN-QA, a novel medical question answering dataset containing ``progressive questions'' composed of related sequential questions. We show the benefits of our training strategy on this dataset. The study's findings highlight the potential of small language models in the medical domain when appropriately fine-tuned. The code and weights are available at https://github.com/raidium-med/MQG. |
Ishan
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1300
-
Khorashadizadeh 2023
Exploring In-Context Learning Capabilities of Foundation Models for Generating Knowledge Graphs from Text
CEUR Workshop Proceedings 2023;3447():132-153 CEUR-WS 2023 Ref ID: 5243 Knowledge graphs can represent information about the real world using entities and their relations in a structured and semantically rich manner, and they enable a variety of downstream applications such as question-answering, recommendation systems, semantic search, and advanced analytics. However, building a knowledge graph currently involves a lot of manual effort, which hinders its application in some situations; automating this process would especially benefit small organizations. Automatically generating structured knowledge graphs from a large volume of natural language is still a challenging task, and research on sub-tasks such as named entity extraction, relation extraction, entity and relation linking, and knowledge graph construction aims to improve the state of the art of automatic construction and completion of knowledge graphs from text. The recent advancement of foundation models with billions of parameters, trained in a self-supervised manner on large volumes of data and adaptable to a variety of downstream tasks, has demonstrated high performance on a large range of Natural Language Processing (NLP) tasks. In this context, one emerging paradigm is in-context learning, where a language model is used as-is with a prompt that provides instructions and some examples to perform a task, without changing the parameters of the model through traditional approaches such as fine-tuning. This way, no computing resources are needed for re-training/fine-tuning the models and the engineering effort is minimal. Thus, it would be beneficial to utilize such capabilities for generating knowledge graphs from text. 
In this paper, grounded by several research questions, we explore the capabilities of foundation models such as ChatGPT to generate knowledge graphs from the knowledge it captured during pre-training as well as the new text provided to it in the prompt. The paper provides a qualitative analysis of a set of example outputs generated by a foundation model with the aim of knowledge graph construction and completion. The results demonstrate promising capabilities. Furthermore, we discuss the challenges and next steps for this research work. © 2023 CEUR-WS. All rights reserved. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3828
-
Khorashadizadeh 2024
Research Trends for the Interplay between Large Language Models and Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8379 This survey investigates the synergistic relationship between Large Language Models (LLMs) and Knowledge Graphs (KGs), which is crucial for advancing AI's capabilities in understanding, reasoning, and language processing. It aims to address gaps in current research by exploring areas such as KG Question Answering, ontology generation, KG validation, and the enhancement of KG accuracy and consistency through LLMs. The paper further examines the roles of LLMs in generating descriptive texts and natural language queries for KGs. Through a structured analysis that includes categorizing LLM-KG interactions, examining methodologies, and investigating collaborative uses and potential biases, this study seeks to provide new insights into the combined potential of LLMs and KGs. It highlights the importance of their interaction for improving AI applications and outlines future research directions. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#308
-
Kim 2022
Generative Model Using Knowledge Graph for Document-Grounded Conversations
Featured Application: Core technology for document-grounded conversation. Document-grounded conversation (DGC) is a natural language generation task to generate fluent and informative responses by leveraging dialogue history and document(s). Recently, DGC research has focused on fine-tuning using pretrained language models. However, these approaches have a problem in that they must leverage the background knowledge under capacity constraints. For example, the maximum length of the input is limited to 512 or 1024 tokens. This problem is fatal in DGC because most documents are longer than the maximum input length. To address this problem, we propose a document-grounded generative model using a knowledge graph. The proposed model converts knowledge sentences extracted from the given document(s) into knowledge graphs and fine-tunes the pretrained model using the graph. We validated the effectiveness of the proposed model using a comparative experiment on the well-known Wizard-of-Wikipedia dataset. The proposed model outperformed the previous state-of-the-art model in our experiments on the Doc2dial dataset. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1339
-
Kim 2024
Fusarium Protein Toolkit: a web-based resource for structural and variant analysis of Fusarium species
Background: The genus Fusarium poses significant threats to food security and safety worldwide because numerous species of the fungus cause destructive diseases and/or mycotoxin contamination in crops. The adverse effects of climate change are exacerbating some existing threats and causing new problems. These challenges highlight the need for innovative solutions, including the development of advanced tools to identify targets for control strategies. Description: In response to these challenges, we developed the Fusarium Protein Toolkit (FPT), a web-based tool that allows users to interrogate the structural and variant landscape within the Fusarium pan-genome. The tool displays both AlphaFold and ESMFold-generated protein structure models from six Fusarium species. The structures are accessible through a user-friendly web portal and facilitate comparative analysis, functional annotation inference, and identification of related protein structures. Using a protein language model, FPT predicts the impact of over 270 million coding variants in two of the most agriculturally important species, Fusarium graminearum and F. verticillioides. To facilitate the assessment of naturally occurring genetic variation, FPT provides variant effect scores for proteins in a Fusarium pan-genome based on 22 diverse species. The scores indicate potential functional consequences of amino acid substitutions and are displayed as intuitive heatmaps using the PanEffect framework. Conclusion: FPT fills a knowledge gap by providing previously unavailable tools to assess structural and missense variation in proteins produced by Fusarium. FPT has the potential to deepen our understanding of pathogenic mechanisms in Fusarium, and aid the identification of genetic targets for control strategies that reduce crop diseases and mycotoxin contamination. Such targets are vital to solving the agricultural problems incited by Fusarium, particularly evolving threats resulting from climate change. 
Thus, FPT has the potential to contribute to improving food security and safety worldwide. © This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply 2024. |
Davis
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2393
-
Kim 2013
Entity Translation Mining from Comparable Corpora: Combining Graph Mapping with Corpus Latent Features
IEEE Transactions on Knowledge and Data Engineering 2013;25(8):1787-1800 2013 DOI: 10.1109/TKDE.2012.117 · Ref ID: 6460 This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Motivated by this observation, we propose a new holistic approach by 1) combining all similarity types used and 2) additionally considering relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from comparable corpora to extract relationship between named entities. Entity similarity and entity context similarity are then calculated from every pair of bilingual named entities. A reinforcing method is utilized to reflect relationship similarity and relationship context similarity between named entities. We also discover "latent" features lost in the graph extraction process and integrate this into our framework. According to our experimental results, our holistic graph-based approach and its enhancement using corpus latent features are highly effective and our framework significantly outperforms previous approaches. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1831
-
Kim 2022
Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models
Proceedings - International Conference on Computational Linguistics, COLING 2022;29():5398-5408 Association for Computational Linguistics (ACL) 2022 Ref ID: 5348 Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models. While attractive in that, similar to in-context learning, it does not require task-specific fine-tuning, the practical effectiveness of such an approach still remains unclear, except that it can function as a probe for investigating language models' inner workings. In this work, we mathematically reformulate CPE-PLM and propose two advanced ensemble methods tailored for it, demonstrating that the new parsing paradigm can be competitive with common unsupervised parsers by introducing a set of heterogeneous PLMs combined using our techniques. Furthermore, we explore some scenarios where the trees generated by CPE-PLM are practically useful. Specifically, we show that CPE-PLM is more effective than typical supervised parsers in few-shot settings. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved. |
Davis
voted
yuexi
voted
Final decision
What was the agreed final decision?
#163
-
Kim 2021
Deep Learning-Based Knowledge Graph Generation for COVID-19
Many attempts have been made to construct new domain-specific knowledge graphs using the existing knowledge base of various domains. However, traditional "dictionary-based" or "supervised" knowledge graph building methods rely on predefined human-annotated resources of entities and their relationships. The cost of creating human-annotated resources is high in terms of both time and effort. This means that relying on human-annotated resources will not allow rapid adaptability in describing new knowledge when domain-specific information is added or updated very frequently, such as with the recent coronavirus disease-19 (COVID-19) pandemic situation. Therefore, in this study, we propose an Open Information Extraction (OpenIE) system based on unsupervised learning without a pre-built dataset. The proposed method obtains knowledge from a vast amount of text documents about COVID-19 rather than a general knowledge base and add this to the existing knowledge graph. First, we constructed a COVID-19 entity dictionary, and then we scraped a large text dataset related to COVID-19. Next, we constructed a COVID-19 perspective language model by fine-tuning the bidirectional encoder representations from transformer (BERT) pre-trained language model. Finally, we defined a new COVID-19-specific knowledge base by extracting connecting words between COVID-19 entities using the BERT self-attention weight from COVID-19 sentences. Experimental results demonstrated that the proposed Co-BERT model outperforms the original BERT in terms of mask prediction accuracy and metric for evaluation of translation with explicit ordering (METEOR) score. |
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1329
-
Kim 2024
Foundation Model for Biomedical Graphs: Integrating Knowledge Graphs and Protein Structures to Large Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;4():346-355 Association for Computational Linguistics (ACL) 2024 Ref ID: 4331 The Transformer model has become a de-facto standard in natural language processing. Its adaptations in other fields such as computer vision showed promising results, indicating that this architecture is a powerful neural network for representation learning regardless of the data type. This recent success has led to research in multimodal Large Language Models (LLMs), which has enabled new types of tasks and applications with multiple data types. However, multimodal LLMs in the biomedical domain are primarily limited to images, text, and/or sequence data. Here I propose to work on a multimodal LLM architecture for biomedical graphs such as protein structures and chemical molecules. The research hypothesis is based on the fact that clinicians and researchers in computational biology and clinical research take advantage of various information in their decision-making process. Therefore, an AI model able to handle multiple data types should boost its ability to use diverse knowledge for improved performance in clinical applications. ©2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#355
-
Kirk 2024
Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18390-18398 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3747 Large language models (LLMs) offer significant promise as a knowledge source for task learning. Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks. We describe a cognitive-agent approach, STARS, that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences. The STARS approach is to increase the response space of LLMs and deploy general strategies, embedded within the autonomous agent, to evaluate, repair, and select among candidate responses produced by the LLM. We describe the approach and experiments that show how an agent, by retrieving and evaluating a breadth of responses from the LLM, can achieve 77-94% task completion in one-shot learning without user oversight. The approach achieves 100% task completion when human oversight (such as an indication of preference) is provided. Further, the type of oversight largely shifts from explicit, natural language instruction to simple confirmation/disconfirmation of high-quality responses that have been vetted by the agent before presentation to a user. |
Davis
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3410
-
Kirk 2023
Exploiting Language Models as a Source of Knowledge for Cognitive Agents
arXiv 2023;(): 2023 Ref ID: 7883 Large language models (LLMs) provide capabilities far beyond sentence completion, including question answering, summarization, and natural-language inference. While many of these capabilities have potential application to cognitive systems, our research is exploiting language models as a source of task knowledge for cognitive agents, that is, agents realized via a cognitive architecture. We identify challenges and opportunities for using language models as an external knowledge source for cognitive systems and possible ways to improve the effectiveness of knowledge extraction by integrating extraction with cognitive architecture capabilities, highlighting with examples from our recent work in this area. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1981
-
Knez 2024
Towards using Automatically Enhanced Knowledge Graphs to Aid Temporal Relation Extraction
1st Workshop on Patient-Oriented Language Processing, CL4Health 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;():131-136 European Language Resources Association (ELRA) 2024 Ref ID: 4650 Temporal relation extraction in medical document analysis is crucial for understanding patient histories and treatment outcomes. This paper introduces a novel approach leveraging a bimodal model integrating textual content and a knowledge graph to enhance temporal relation extraction. The paper presents ongoing research on constructing an optimal knowledge graph by augmenting PrimeKG with dynamically expanded information using a language model-generated knowledge graph. It also further personalizes the information with patient-specific graphs tailored for relation prediction. The pipeline for constructing this enriched knowledge graph is detailed, aiming to improve the capabilities of temporal relation extraction models. The preliminary results show that adding a simple knowledge graph to the temporal relation extraction model can significantly increase the performance, achieving new state-of-the-art results. While research on enhanced knowledge graphs is ongoing, this paper lays the groundwork for leveraging common knowledge to advance temporal relation extraction in medical contexts. This approach holds promise for enhancing the understanding of patient histories and treatment outcomes, potentially leading to improved healthcare decision-making and patient care. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3408
-
Ko 2024
Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering
arXiv 2024;(): 2024 Ref ID: 8161 Recent studies have investigated utilizing Knowledge Graphs (KGs) to enhance the Question Answering (QA) performance of Large Language Models (LLMs), yet structured KG verbalization remains challenging. Existing methods, such as triple-form or free-form textual conversion of triple-form facts, encounter several issues. These include reduced evidence density due to duplicated entities or relationships, and reduced evidence clarity due to an inability to emphasize crucial evidence. To address these issues, we propose EFSum, an Evidence-focused Fact Summarization framework for enhanced QA with knowledge-augmented LLMs. We optimize an open-source LLM as a fact summarizer through distillation and preference alignment. Our extensive experiments show that EFSum improves an LLM's zero-shot QA performance, and it is possible to ensure both the helpfulness and faithfulness of the summary. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#444
-
Koloski 2022
Knowledge graph informed fake news classification via heterogeneous representation ensembles
Increasing amounts of freely available data both in textual and relational form offers exploration of richer document representations, potentially improving the model performance and robustness. An emerging problem in the modern era is fake news detection-many easily available pieces of information are not necessarily factually correct, and can lead to wrong conclusions or are used for manipulation. In this work we explore how different document representations, ranging from simple symbolic bag-of-words, to contextual, neural language model-based ones can be used for efficient fake news identification. One of the key contributions is a set of novel document representation learning methods based solely on knowledge graphs, i.e., extensive collections of (grounded) subject-predicate-object triplets. We demonstrate that knowledge graph-based representations already achieve competitive performance to conventionally accepted representation learners. Furthermore, when combined with existing, contextual representations, knowledge graph-based document representations can achieve state-of-the-art performance. To our knowledge this is the first larger-scale evaluation of how knowledge graph-based representations can be systematically incorporated into the process of fake news classification. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3440
-
Kommineni 2024
From human experts to machines: An LLM supported approach to ontology and knowledge graph construction
arXiv 2024;(): 2024 Ref ID: 8179 The conventional process of building Ontologies and Knowledge Graphs (KGs) heavily relies on human domain experts to define entities and relationship types, establish hierarchies, maintain relevance to the domain, fill the ABox (or populate with instances), and ensure data quality (including amongst others accuracy and completeness). On the other hand, Large Language Models (LLMs) have recently gained popularity for their ability to understand and generate human-like natural language, offering promising ways to automate aspects of this process. This work explores the (semi-)automatic construction of KGs facilitated by open-source LLMs. Our pipeline involves formulating competency questions (CQs), developing an ontology (TBox) based on these CQs, constructing KGs using the developed ontology, and evaluating the resultant KG with minimal to no involvement of human experts. We showcase the feasibility of our semi-automated pipeline by creating a KG on deep learning methodologies by exploiting scholarly publications. To evaluate the answers generated via Retrieval-Augmented-Generation (RAG) as well as the KG concepts automatically extracted using LLMs, we design a judge LLM, which rates the generated content based on ground truth. Our findings suggest that employing LLMs could potentially reduce the human effort involved in the construction of KGs, although a human-in-the-loop approach is recommended to evaluate automatically generated KGs. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1009
-
Kong 2024
Automated Knowledge Mining and Knowledge Graph Reasoning for Aircraft Engine Maintenance
ACM International Conference Proceeding Series 2024;():35-40 Association for Computing Machinery 2024 DOI: 10.1145/3689218.3689221 · Ref ID: 3837 The maintenance process for aircraft engines is fraught with significant challenges due to their inherent complexity. Large Language Models excel in general Natural Language Processing tasks, yet they lack domain-specific knowledge, thereby compromising their performance in specialized areas. The varied descriptions of engine faults also render traditional text matching algorithms unsuitable for this maintenance domain. In this paper, we construct a knowledge graph integrated with fault diagnosis reasoning ability with knowledge mined from aircraft engine maintenance data. Firstly, we propose the Knowledge Mining and Knowledge Graph Reasoning framework for aircraft engine maintenance data knowledge mining and aircraft engine fault diagnosis. Secondly, we utilize prompts with in-context learning to mitigate the issue of the model lacking expertise in the field of aircraft engine maintenance. Finally, we adopt a sentence similarity calculation method based on BERT, which enables more effective processing of semantic information. We apply our method to the Aircraft Engine Fault dataset, which is collected from maintenance records of civil aircraft engines from 2007 to 2015, and experimental results demonstrate the effectiveness of our knowledge mining method and aircraft engine fault reasoning algorithm. © 2024 ACM. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#2017
-
Koohborfardhaghighi 2024
Unlocking the Power of LLM-Based Question Answering Systems: Enhancing Reasoning, Insight, and Automation with Knowledge Graphs
Lecture Notes in Networks and Systems 2024;1052 LNNS():156-171 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-3-031-64776-5_16 · Ref ID: 4409 In today’s data-driven business landscape, Knowledge Graphs can be effectively layered on top of relational databases and ontologies, a powerful combination for transforming how businesses tackle complex queries and decision-making processes. In this paper, we present a series of experiments that demonstrate the opportunities and advantages of blending knowledge graphs with Large Language Models (LLMs) through a practical use case. Our experimental results provide insights into the reasoning capabilities of LLMs when utilizing Knowledge Graph-Prompting. Furthermore, we observed the significance of maintaining uniformity in the language employed during knowledge graph construction to ensure precise responses from LLMs when querying the knowledge graph. This consistency also resonates in the embedding space of the model, where elements like relationship types are reflected in the resulting embeddings. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1096
-
Korini 2023
Column Type Annotation using ChatGPT
CEUR Workshop Proceedings 2023;3462(): CEUR-WS 2023 Ref ID: 5242 Column type annotation is the task of annotating the columns of a relational table with the semantic type of the values contained in each column. Column type annotation is an important pre-processing step for data search and data integration in the context of data lakes. State-of-the-art column type annotation methods either rely on matching table columns to properties of a knowledge graph or fine-tune pre-trained language models such as BERT for column type annotation. In this work, we take a different approach and explore using ChatGPT for column type annotation. We evaluate different prompt designs in zero- and few-shot settings and experiment with providing task definitions and detailed instructions to the model. We further implement a two-step table annotation pipeline which first determines the class of the entities described in the table and, depending on this class, asks ChatGPT to annotate columns using only the relevant subset of the overall vocabulary. Using instructions as well as the two-step pipeline, ChatGPT reaches F1 scores of over 85% in zero- and one-shot setups. To reach a similar F1 score, a RoBERTa model needs to be fine-tuned with 356 examples. This comparison shows that ChatGPT is able to deliver competitive results for the column type annotation task given no or only a minimal amount of task-specific demonstrations. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
Kwesi
voted
brandon
voted
Final decision
What was the agreed final decision?
#1898
-
Kosten 2023
Spider4SPARQL: A Complex Benchmark for Evaluating Knowledge Graph Question Answering Systems
Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():5272-5281 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BigData59044.2023.10386182 · Ref ID: 4933 With the recent spike in the number and availability of Large Language Models (LLMs), it has become increasingly important to provide large and realistic benchmarks for evaluating Knowledge Graph Question Answering (KGQA) systems. So far the majority of benchmarks rely on pattern-based SPARQL query generation approaches. The subsequent natural language (NL) question generation is conducted through crowdsourcing or other automated methods, such as rule-based paraphrasing or NL question templates. Although some of these datasets are of considerable size, their pitfall lies in their pattern-based generation approaches, which do not always generalize well to the vague and linguistically diverse questions asked by humans in real-world contexts. In this paper, we introduce Spider4SPARQL, a new SPARQL benchmark dataset featuring 9,693 previously existing manually generated NL questions and 4,721 unique, novel, and complex SPARQL queries of varying complexity. In addition to the NL/SPARQL pairs, we also provide their corresponding 166 knowledge graphs and ontologies, which cover 138 different domains. Our complex benchmark enables novel ways of evaluating the strengths and weaknesses of modern KGQA systems. We evaluate the benchmark with state-of-the-art KGQA systems as well as LLMs, which achieve only up to 45% execution accuracy, demonstrating that Spider4SPARQL is a challenging benchmark for future research. © 2023 IEEE. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#108
-
Koto 2022
Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian
1st Workshop on Commonsense Representation and Reasoning (CSRR) 2022;():8-16 Dublin, IRELAND Assoc Computational Linguistics-Acl 2022 Ref ID: 3707 Story comprehension that involves complex causal and temporal relations is a critical task in NLP, but previous studies have focused predominantly on English, leaving open the question of how the findings generalize to other languages, such as Indonesian. In this paper, we follow the Story Cloze Test framework of Mostafazadeh et al. (2016) in evaluating story understanding in Indonesian, by constructing a four-sentence story with one correct ending and one incorrect ending. To investigate commonsense knowledge acquisition in language models, we experimented with: (1) a classification task to predict the correct ending; and (2) a generation task to complete the story with a single sentence. We investigate these tasks in two settings: (i) monolingual training and (ii) zero-shot cross-lingual transfer between Indonesian and English. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1103
-
Krishnan 2020
Common-knowledge concept recognition for SEVA
CEUR Workshop Proceedings 2020;2600(): CEUR-WS 2020 Ref ID: 5792 We build a common-knowledge concept recognition system for a Systems Engineer’s Virtual Assistant (SEVA) which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question-answering. The problem is formulated as a token classification task similar to named entity extraction. With the help of a domain expert and text processing methods, we construct a dataset annotated at the word-level by carefully defining a labelling scheme to train a sequence model to recognize systems engineering concepts. We use a pre-trained language model and fine-tune it with the labeled dataset of concepts. In addition, we also create some essential datasets for information such as abbreviations and definitions from the systems engineering domain. Finally, we construct a simple knowledge graph using these extracted concepts along with some hyponym relations. Copyright © 2020 held by the author(s). |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3002
-
Kroll 2021
A Toolbox for the Nearly-Unsupervised Construction of Digital Library Knowledge Graphs
2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL) 2021;():21-30 2021 DOI: 10.1109/JCDL52503.2021.00014 · Ref ID: 6085 Knowledge graphs are essential for digital libraries to store entity-centric knowledge. The applications of knowledge graphs range from summarizing entity information over answering complex queries to inferring new knowledge. Yet, building knowledge graphs means either relying on manual curation or designing supervised extraction processes to harvest knowledge from unstructured text. Obviously, both approaches are cost-intensive. Yet, the question is whether we can minimize the efforts to build a knowledge graph. And indeed, we propose a toolbox that provides methods to extract knowledge from arbitrary text. Our toolkit bypasses the need for supervision nearly completely and includes a novel algorithm to close the missing gaps. As a practical demonstration, we analyze our toolbox on established biomedical benchmarks. As far as we know, we are the first who propose, analyze and share a nearly unsupervised and complete toolbox for building knowledge graphs from text. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1824
-
Kruit 2024
Retrieval-based Question Answering with Passage Expansion using a Knowledge Graph
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():14063-14072 European Language Resources Association (ELRA) 2024 Ref ID: 4600 Recent advancements in dense neural retrievers and language models have led to large improvements in state-of-the-art approaches to open-domain Question Answering (QA) based on retriever-reader architectures. However, issues stemming from data quality and imbalances in the use of dense embeddings have hindered performance, particularly for less common entities and facts. To tackle these problems, this study explores a multi-modal passage retrieval model's potential to bolster QA system performance. This study poses three key questions: (1) Can a distantly supervised question-relation extraction model enhance retrieval using a knowledge graph (KG), compensating for dense neural retrievers' shortcomings with rare entities? (2) How does this multi-modal approach compare to existing QA systems based on textual features? (3) Can this QA system alleviate poor performance on less common entities on common benchmarks? We devise a multi-modal retriever combining entity features and textual data, leading to improved retrieval precision in some situations, particularly for less common entities. Experiments across different datasets confirm enhanced performance for entity-centric questions, but challenges remain in handling complex generalized questions. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2013
-
Kuang 2023
Unleashing the Power of Language Models in Text-Attributed Graph
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():8429-8441 Association for Computational Linguistics (ACL) 2023 Ref ID: 5076 Representation learning on graphs has been demonstrated to be a powerful tool for solving real-world problems. Among different types of graphs, text-attributed graphs carry both semantic and structural information. Existing works have paved the way for knowledge extraction from this type of data by leveraging language models, graph neural networks, or a combination of them. However, these works suffer from issues like underutilization of relationships between nodes or words, or unaffordable memory cost. In this paper, we propose a Node Representation Update Pre-training Architecture based on Co-modeling Text and Graph (NRUP). In NRUP, we construct a hierarchical text-attributed graph that incorporates both initial nodes and word nodes. Meanwhile, we apply four self-supervised tasks for different levels of the constructed graph. We further design the pre-training framework to update the features of nodes during training epochs. We conduct experiments on the benchmark dataset ogbn-arxiv. Our method outperforms the baselines, fully demonstrating its validity and generalization. © 2023 Association for Computational Linguistics. |
Voted: Srividya, Ishan
#1370
-
Kuang 2024
Harnessing multimodal large language models for traffic knowledge graph generation and decision-making
|
Voted: mohammed afaan, yuexi
#749
-
Kuegler 2022
A Semantic Annotation Pipeline towards the Generation of Knowledge Graphs in Tribology
Within the domain of tribology, enterprises and research institutions are constantly working on new concepts, materials, lubricants, or surface technologies for a wide range of applications. This is also reflected in the continuously growing number of publications, which in turn serve as guidance and benchmark for researchers and developers. Due to the lack of suited data and knowledge bases, knowledge acquisition and aggregation is still a manual process involving the time-consuming review of literature. Therefore, semantic annotation and natural language processing (NLP) techniques can decrease this manual effort by providing a semi-automatic support in knowledge acquisition. The generation of knowledge graphs as a structured information format from textual sources promises improved reuse and retrieval of information acquired from scientific literature. Motivated by this, the contribution introduces a novel semantic annotation pipeline for generating knowledge in the domain of tribology. The pipeline is built on Bidirectional Encoder Representations from Transformers (BERT)-a state-of-the-art language model-and involves classic NLP tasks like information extraction, named entity recognition and question answering. Within this contribution, the three modules of the pipeline for document extraction, annotation, and analysis are introduced. Based on a comparison with a manual annotation of publications on tribological model testing, satisfactory performance is verified. |
Voted: brandon, Kwesi
#1088
-
Kulkarni 2023
Cognitive Retrieve: Empowering Document Retrieval with Semantics and Domain-Specific Knowledge Graph
CEUR Workshop Proceedings 2023;3532(): CEUR-WS 2023 Ref ID: 5196 As the data landscape continues to expand, the task of identifying relevant documents becomes increasingly complex, especially when dealing with diverse and varied data sources. Traditional keyword-based search systems struggle to capture the subtle contextual meaning of search queries. Semantic-based search, leveraging open data knowledge graphs, offers a solution by understanding contextual meaning. However, its effectiveness relies heavily on the quality and completeness of the underlying data used to define these semantics. However, incomplete data can lead to spurious results and a lack of relevance in the retrieved documents. To bridge this gap between user search interest and retrieval outcomes, we propose integrating domain-specific alignment into the search process. Our research aims to achieve this through the development of a semantic-driven data processing pipeline, laying the foundation for seamless semantic-oriented retrieval. This approach includes metadata extraction, considering domain-specific keywords and structural metadata from heterogeneous data sources. We enhance metadata by identifying latent terms using language models. Furthermore, we incorporate latent concepts and domain-specific information gathered from domain experts into a special knowledge graph construct- a ‘concept graph’. Our primary focus is on identifying relevant concepts from this graph, aligning with semantic and contextual aspects of the specified search intent. Our proposed document retrieval system, which combines the concept graph with semantics, is implemented using data from the Government of Karnataka, India. This approach addresses the administrative need to extract relevant documents from data silos, offering an alternative approach to traditional methods. 
Extensive evaluations demonstrate the proposed system’s superior performance in terms of true positive results compared to baseline systems like Lucene, Elasticsearch, and Doc2Vec. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). |
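The concept-graph retrieval idea above can be sketched as query expansion over a toy graph before document matching (the graph, terms, and documents below are invented for illustration; the paper's actual pipeline over Government of Karnataka data is far richer):

```python
# Toy concept graph: term -> related domain-specific concepts (hypothetical).
concept_graph = {
    "land record": {"survey number", "khata", "property deed"},
    "tax": {"property tax", "revenue"},
}

def expand_query(terms, graph):
    """Expand query terms with neighbouring concepts from the concept graph."""
    expanded = set(terms)
    for t in terms:
        expanded |= graph.get(t, set())
    return expanded

def retrieve(query_terms, documents, graph):
    """Rank documents by term overlap with the concept-expanded query."""
    expanded = expand_query(query_terms, graph)
    scored = [(sum(w in expanded for w in doc_terms), doc_id)
              for doc_id, doc_terms in documents.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score > 0]

docs = {
    "d1": {"khata", "transfer", "procedure"},  # matches only via the graph
    "d2": {"weather", "report"},
}
assert retrieve({"land record"}, docs, concept_graph) == ["d1"]
```

A keyword-only matcher would miss `d1` entirely; the expansion step is what aligns search intent with domain vocabulary.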
Voted: mohammed afaan, yuexi
#3493
-
Kulkarni 2024
HeCiX: Integrating Knowledge Graphs and Large Language Models for Biomedical Research
arXiv 2024;(): 2024 Ref ID: 8474 Despite advancements in drug development strategies, 90% of clinical trials fail. This suggests overlooked aspects in target validation and drug optimization. In order to address this, we introduce HeCiX-KG, Hetionet-Clinicaltrials neXus Knowledge Graph, a novel fusion of data from ClinicalTrials.gov and Hetionet in a single knowledge graph. HeCiX-KG combines data on previously conducted clinical trials from ClinicalTrials.gov, and domain expertise on diseases and genes from Hetionet. This offers a thorough resource for clinical researchers. Further, we introduce HeCiX, a system that uses LangChain to integrate HeCiX-KG with GPT-4, and increase its usability. HeCiX shows high performance during evaluation against a range of clinically relevant issues, proving this model to be promising for enhancing the effectiveness of clinical research. Thus, this approach provides a more holistic view of clinical trials and existing biological data. |
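A minimal sketch of fusing two sources into a single knowledge graph, in the spirit of HeCiX-KG (the node and edge data below are toy stand-ins, not actual Hetionet or ClinicalTrials.gov records):

```python
def merge_graphs(g1, g2):
    """Union two KGs given as {"nodes": {id: type}, "edges": [triples]}.

    Nodes are keyed by identifier, so entities shared across sources
    (e.g. a disease appearing in both) are fused into one node.
    """
    nodes = {**g1["nodes"], **g2["nodes"]}
    edges = sorted(set(g1["edges"]) | set(g2["edges"]))
    return {"nodes": nodes, "edges": edges}

hetionet_like = {
    "nodes": {"BRCA1": "Gene", "breast cancer": "Disease"},
    "edges": [("BRCA1", "associates", "breast cancer")],
}
trials_like = {
    "nodes": {"NCT001": "ClinicalTrial", "breast cancer": "Disease"},
    "edges": [("NCT001", "studies", "breast cancer")],
}
kg = merge_graphs(hetionet_like, trials_like)
assert len(kg["nodes"]) == 3  # shared Disease node fused by identifier
assert ("NCT001", "studies", "breast cancer") in kg["edges"]
```

The fused node is what lets queries traverse from a trial to gene-level domain knowledge in one hop.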
Voted: mohammed afaan, Ishan
#3492
-
Kulumba 2024
Harvesting Textual and Structured Data from the HAL Publication Repository
arXiv 2024;(): 2024 Ref ID: 8496 HAL (Hyper Articles en Ligne) is the French national publication repository, used by most higher education and research organizations for their open science policy. As a digital library, it is a rich repository of scholarly documents, but its potential for advanced research has been underutilized. We present HALvest, a unique dataset that bridges the gap between citation networks and the full text of papers submitted on HAL. We craft our dataset by filtering HAL for scholarly publications, resulting in approximately 700,000 documents, spanning 34 languages across 13 identified domains, suitable for language model training, and yielding approximately 16.5 billion tokens (with 8 billion in French and 7 billion in English, the most represented languages). We transform the metadata of each paper into a citation network, producing a directed heterogeneous graph. This graph includes uniquely identified authors on HAL, as well as all open submitted papers, and their citations. We provide a baseline for authorship attribution using the dataset, implement a range of state-of-the-art models in graph representation learning for link prediction, and discuss the usefulness of our generated knowledge graph structure. |
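Transforming paper metadata into a directed heterogeneous graph, as described above, can be sketched like this (the identifiers and relation names are hypothetical, not real HAL ids or the authors' schema):

```python
from collections import defaultdict

def build_citation_graph(papers):
    """Directed heterogeneous graph with typed edges.

    Adjacency is keyed by (source, relation), so paper->author and
    paper->paper edges coexist in one structure.
    """
    adj = defaultdict(list)
    for p in papers:
        for a in p["authors"]:
            adj[(p["id"], "authored_by")].append(a)
        for c in p["cites"]:
            adj[(p["id"], "cites")].append(c)
    return dict(adj)

papers = [
    {"id": "hal-001", "authors": ["auth-1"], "cites": ["hal-002"]},
    {"id": "hal-002", "authors": ["auth-2"], "cites": []},
]
g = build_citation_graph(papers)
assert g[("hal-001", "cites")] == ["hal-002"]
assert g[("hal-001", "authored_by")] == ["auth-1"]
```

On such a graph, link prediction amounts to scoring candidate `cites` edges that are absent from the adjacency.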
Voted: Kwesi, mohammed afaan
#403
-
Kumar 2022
K-LM: Knowledge Augmenting in Language Models Within the Scholarly Domain
The use of superior algorithms and complex architectures in language models have successfully imparted human-like abilities to machines for specific tasks. But two significant constraints, the available training data size and the understanding of domain-specific context, hamper the pre-trained language models from optimal and reliable performance. A potential solution to tackle these limitations is to equip the language models with domain knowledge. While the commonly adopted techniques use Knowledge Graphs Embeddings (KGEs) to inject domain knowledge, we provide a Knowledge Language Model (K-LM) to use the Resource Description Framework (RDF) triples directly, extracted from world knowledge bases. The proposed model works in conjunction with Generative Pretrained Transformer (GPT-2) and Bidirectional Encoder Representations from Transformers (BERT) and uses a well-defined pipeline to select, categorize, and filter the RDF triples. In addition, we introduce heuristic methods to inject domain-specific knowledge in K-LM, leveraging knowledge graphs (KGs). We tested our approaches on the classification task within the scholarly domain using two KGs, and our results show that our proposed language model has significantly outperformed the baselines and BERT for each KG. Our experimental findings also help us conclude the importance of relevance of KG used over the quantity of injected RDF triples. Also, each of our proposed methods for injecting the RDF triples has increased the overall model's accuracy, demonstrating that K-LM is a potential choice for domain adaptation to solve knowledge-driven problems. |
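A hedged sketch of the triple selection-and-injection step the abstract describes: filter RDF triples by domain relevance, then verbalize them into the model input. The filtering rule and the `[SEP]` marker are illustrative assumptions, not K-LM's actual pipeline:

```python
def filter_triples(triples, domain_entities):
    """Keep RDF (s, p, o) triples whose subject or object is in the domain.

    This mirrors the idea that relevance of the injected triples matters
    more than their quantity.
    """
    return [t for t in triples if t[0] in domain_entities or t[2] in domain_entities]

def inject(text, triples):
    """Prepend verbalized triples so the LM sees domain facts as context."""
    facts = " ".join(f"{s} {p} {o}." for s, p, o in triples)
    return f"{facts} [SEP] {text}" if triples else text

triples = [
    ("BERT", "is_a", "language model"),
    ("Paris", "capital_of", "France"),   # off-domain: filtered out
]
scholarly = {"BERT", "language model"}
kept = filter_triples(triples, scholarly)
assert kept == [("BERT", "is_a", "language model")]
assert inject("Classify this abstract.", kept).startswith("BERT is_a language model.")
```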
Voted: Ishan, Srividya
#558
-
Kumichev 2024
MedSyn: LLM-Based Synthetic Medical Text Generation Framework
Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD) 2024;14950():215-230 Vilnius, LITHUANIA Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-70381-2_14 · Ref ID: 3384 Generating synthetic text addresses the challenge of data availability in privacy-sensitive domains such as healthcare. This study explores the applicability of synthetic data in real-world medical settings. We introduce MedSyn, a novel medical text generation framework that integrates large language models with a Medical Knowledge Graph (MKG). We use MKG to sample prior medical information for the prompt and generate synthetic clinical notes with GPT-4 and fine-tuned LLaMA models. We assess the benefit of synthetic data through application in the ICD code prediction task. Our research indicates that synthetic data can increase the classification accuracy of vital and challenging codes by up to 17.8% compared to settings without synthetic data. Furthermore, to provide new data for further research in the healthcare domain, we present the largest open-source synthetic dataset of clinical notes for the Russian language, comprising over 41k samples covering 219 ICD-10 codes. |
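The MKG-sampling step can be sketched as follows; the toy graph, ICD code, and prompt wording are assumptions for illustration, not MedSyn's actual knowledge graph or prompts:

```python
import random

def kg_prompt(mkg, icd_code, k=2, seed=0):
    """Sample k symptoms linked to an ICD code and build a generation prompt.

    Sampling prior medical facts from the graph grounds the synthetic
    note in code-consistent clinical content.
    """
    symptoms = sorted(mkg.get(icd_code, []))  # sort for determinism
    picked = random.Random(seed).sample(symptoms, min(k, len(symptoms)))
    return (f"Write a clinical note consistent with ICD-10 {icd_code}; "
            f"mention: {', '.join(picked)}.")

mkg = {"J45": {"wheezing", "dyspnea", "cough"}}  # toy medical KG
prompt = kg_prompt(mkg, "J45")
assert prompt.startswith("Write a clinical note consistent with ICD-10 J45")
```

The seeded RNG keeps the sketch reproducible; a real pipeline would vary the sample per generated note.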
Voted: Davis, mohammed afaan
#3031
-
Kunze 2011
Towards semantic robot description languages
2011 IEEE International Conference on Robotics and Automation 2011;():5589-5595 2011 DOI: 10.1109/ICRA.2011.5980170 · Ref ID: 6827 There is a semantic gap between simple but high-level action instructions like “Pick up the cup with the right hand” and low-level robot descriptions that model, for example, the structure and kinematics of a robot's manipulator. Currently, programmers bridge this gap by mapping abstract instructions to parametrized algorithms and rigid body parts of a robot within their control programs. By linking descriptions of robot components, i.e. sensors, actuators and control programs, via capabilities to actions in an ontology we equip robots with knowledge about themselves that allows them to infer the required components for performing a given action. Thereby a robot that is instructed by an end-user, a programmer, or even another robot to perform a certain action, can assess itself whether it is able and how to perform the requested action. This self-knowledge for robots could considerably change the way of robot control, robot interaction, robot programming, and multi-robot communication. |
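Inferring the components required for an action from capability links, as described above, can be sketched like this (the toy ontology entries are invented, not the authors' robot description language):

```python
# Toy ontology: actions require capabilities; components provide them.
ACTION_NEEDS = {"pick_up_cup": {"grasping", "arm_motion"}}
COMPONENT_PROVIDES = {
    "right_gripper": {"grasping"},
    "arm_controller": {"arm_motion"},
    "camera": {"object_detection"},
}

def components_for(action, installed):
    """Return installed components covering the action's capabilities,
    or None if the robot cannot perform the action."""
    needed = set(ACTION_NEEDS[action])
    chosen = []
    for comp in installed:
        provided = COMPONENT_PROVIDES.get(comp, set())
        if provided & needed:
            chosen.append(comp)
            needed -= provided
    return chosen if not needed else None

assert components_for("pick_up_cup", ["right_gripper", "arm_controller"]) == \
    ["right_gripper", "arm_controller"]
assert components_for("pick_up_cup", ["camera"]) is None  # cannot grasp
```

Returning `None` is the self-assessment step: the robot can report that it lacks a capability before attempting the action.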
Voted: mohammed afaan, yuexi
#1688
-
Kurdiukov 2024
nlp_enjoyers at TextGraphs-17 Shared Task: Text-Graph Representations for Knowledge Graph Question Answering using all-MPNet
TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():126-130 Association for Computational Linguistics (ACL) 2024 Ref ID: 4234 This paper presents a model for solving the Multiple Choice Question Answering (MCQA) problem, focusing on the impact of subgraph extraction from a Knowledge Graph on model performance. The proposed method combines textual and graph information by adding linearized subgraphs directly into the main question prompt with separate tokens, enhancing the performance of models working with each modality separately. The study also includes an examination of Large Language Model (LLM) backbones and the benefits of linearized subgraphs and sequence length, with efficient training achieved through fine-tuning with LoRA. The top benchmark, using subgraphs and MPNet, achieved an F1 score of 0.3887. The main limitation of the experiments is the reliance on pre-generated subgraphs/triplets from the graph, and the lack of exploration of in-context learning and prompting strategies with decoder-based architectures. © 2024 Association for Computational Linguistics. |
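Adding linearized subgraph triples to the question prompt, as described above, might look like this (the separator tokens and prompt layout are assumptions for illustration, not the authors' exact format):

```python
def linearize(triples, sep=" [T] "):
    """Flatten KG triples into a flat token sequence for the prompt."""
    return sep.join(f"{s} | {r} | {o}" for s, r, o in triples)

def build_prompt(question, choices, subgraph):
    """MCQA prompt with the linearized subgraph appended after the question,
    so one text encoder sees both modalities."""
    opts = " ".join(f"({i}) {c}" for i, c in enumerate(choices, 1))
    return f"{question} {opts} [GRAPH] {linearize(subgraph)}"

sub = [("insulin", "treats", "diabetes")]
p = build_prompt("Which drug treats diabetes?", ["insulin", "aspirin"], sub)
assert "[GRAPH] insulin | treats | diabetes" in p
assert "(1) insulin (2) aspirin" in p
```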
Voted: yuexi, Mike
#1345
-
Lageweg 2024
GECKO: A Question Answering System for Official Statistics
CEUR Workshop Proceedings 2024;3759(): CEUR-WS 2024 Ref ID: 4246 This paper presents GECKO, a knowledge graph-based statistical question answering system currently in beta deployment. GECKO aims to facilitate the retrieval of single statistical values from an extensive database containing over a billion values across more than 4,000 tables. The system integrates a comprehensive framework including data augmentation, entity retrieval, and large language model (LLM)-based query generation. A key feature of the beta deployment is the collection of user feedback, which is critical for improving system performance and accuracy. This feedback mechanism allows users to report issues directly, ensuring continuous improvement based on real-world use. © 2024 Copyright for this paper by its authors.
Voted: mohammed afaan, yuexi
#307
-
Lageweg 2024
Generative Expression Constrained Knowledge-Based Decoding for Open Data
21st International Conference on The Semantic Web (ESWC) 2024;14664():307-325 Hersonissos, GREECE Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-60626-7_17 · Ref ID: 3304 In this paper, we present GECKO, a knowledge graph question answering (KGQA) system for data from Statistics Netherlands (Centraal Bureau voor de Statistiek). QA poses great challenges in terms of generating relevant answers as well as preventing hallucinations, a phenomenon found in language models that creates issues when attempting factual QA with these models alone. To overcome these limitations, Statistics Netherlands' publicly available OData4 data was used to create a knowledge graph in which the answer-generation decoding process is grounded, ensuring faithful answers. When processing a question, GECKO performs entity and schema retrieval, carries out schema-constrained expression decoding, makes assumptions where needed, and executes the generated expression as an OData4 query to retrieve information. A novel method was implemented to perform the constrained knowledge-based expression decoding using an encoder-decoder model. Both sparse and dense entity retrieval methods were evaluated. While the encoder-decoder model did not achieve production-ready performance, experiments show promising results for a rule-based baseline using a sparse entity retriever. Additionally, the results of qualitative user testing were positive. We therefore formulate recommendations for deployment to help guide users of Statistics Netherlands data to their answers more quickly.
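Schema-constrained decoding, at its core, restricts each decoding step to schema-valid continuations. A toy sketch (the candidate scores and schema terms are invented, and GECKO uses an encoder-decoder model rather than this greedy filter):

```python
def constrained_decode(step_candidates, schema_terms):
    """Greedy decoding where each step keeps only schema-valid candidates.

    step_candidates: per step, a list of (token, score) pairs proposed
    by the model; tokens outside the schema are pruned before argmax.
    """
    out = []
    for candidates in step_candidates:
        valid = [(tok, s) for tok, s in candidates if tok in schema_terms]
        if not valid:  # no grounded continuation: stop early
            break
        out.append(max(valid, key=lambda ts: ts[1])[0])
    return out

schema = {"Population", "filter", "Year", "eq", "2023"}
steps = [
    [("Population", 0.9), ("Inhabitants", 0.8)],  # hallucinated field pruned
    [("filter", 0.7), ("where", 0.6)],
]
assert constrained_decode(steps, schema) == ["Population", "filter"]
```

Grounding the decoder in the schema is what guarantees every emitted expression refers to fields that actually exist in the data.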
Voted: mohammed afaan, yuexi
#2421
-
Lai 2021
Extracting Semantics of Predicates From Millions of Bio-Medical Abstracts for Inferencing New Biological Key Events and Relationships
2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2021;():3484-3491 2021 DOI: 10.1109/BIBM52615.2021.9669549 · Ref ID: 6201 Adverse outcome pathways (AOP) structure toxicological knowledge as sequential, directed chains of key events (KE) that culminate in adverse outcomes (AOs). AOP development is a laborious process that involves extensive knowledge mining and could be improved via machine and deep learning. In this paper, we present an artificial intelligence system that can accelerate the putative AOP development process by inferencing new AOP modules based on knowledge learned from 16 million pre-parsed PubMed abstracts. Each AOP module is represented as a triplet consisting of antecedent and consequent biological entities connected by a relation. Users can also investigate specific types of antecedents, consequents, and relations by specifying macro/micro templates using the MeSH semantic type hierarchy. We also provide visualizations to illustrate the hidden semantics that our system can extract from input triplets.
Voted: Ishan, brandon
#1792
-
Laigle 2020
REEF: A framework for information extraction and automated knowledge graph construction
1st EAGE Digitalization Conference and Exhibition 2020;(): European Association of Geoscientists and Engineers, EAGE 2020 DOI: 10.3997/2214-4609.202032092 · Ref ID: 5787 We present Reef (Recursive Evidence Extraction Framework), a Python framework for automated information extraction from Petroleum Geoscience databases. Reef enables an end to end pipeline from raw documents to a Knowledge Graph. Reef makes possible two essential operations: 1/ discover entities in documents, characterize them and connect them to abstract concepts present in a knowledge graph and 2/ discover new knowledge with distant supervision. Knowledge graphs are key to build better search engines, Question Answering systems, recommendation engines, feed algorithms for the cross analysis of multiple datasets. Reef unique approach leverages a comprehensive stack of open source and state-of-the-art libraries for documents digitalization and parsing, Natural Language Processing, Language Modeling, Logic Reasoning and Graph Analysis. These foundational components are seconded by custom applications for specific tasks. Documents processed in Reef are digitized and sent through a pipeline where their content is filtered according to a flexible, easily extensible, Petroleum Geoscience specific object model. Information can be extracted from text, tables, figures, diagrams. Reef contains functions to infer information nature, digitize it, disambiguate and reconcile it into a graph database. Reef can be deployed in any cloud and delivers production ready knowledge graphs which can be served to third party applications. © EAGE 2019. |
Voted: brandon, Kwesi
#3555
-
Lairgi 2024
iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models
arXiv 2024;(): 2024 Ref ID: 8579 Most available data is unstructured, making it challenging to access valuable information. Automatically building Knowledge Graphs (KGs) is crucial for structuring data and making it accessible, allowing users to search for information effectively. KGs also facilitate insights, inference, and reasoning. Traditional NLP methods, such as named entity recognition and relation extraction, are key in information retrieval but face limitations, including the use of predefined entity types and the need for supervised learning. Current research leverages large language models' capabilities, such as zero- or few-shot learning. However, unresolved and semantically duplicated entities and relations still pose challenges, leading to inconsistent graphs and requiring extensive post-processing. Additionally, most approaches are topic-dependent. In this paper, we propose iText2KG, a method for incremental, topic-independent KG construction without post-processing. This plug-and-play, zero-shot method is applicable across a wide range of KG construction scenarios and comprises four modules: Document Distiller, Incremental Entity Extractor, Incremental Relation Extractor, and Graph Integrator and Visualization. Our method demonstrates superior performance compared to baseline methods across three scenarios: converting scientific papers to graphs, websites to graphs, and CVs to graphs. |
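A minimal sketch of incremental entity resolution in the spirit of iText2KG's Incremental Entity Extractor, using plain string similarity as a stand-in for the paper's matching strategy (the threshold is an assumption):

```python
import difflib

def resolve(entity, known, threshold=0.85):
    """Map a new entity to an existing one when string similarity is high,
    otherwise register it; this is how semantically duplicated nodes are
    avoided without post-processing."""
    for k in known:
        if difflib.SequenceMatcher(None, entity.lower(), k.lower()).ratio() >= threshold:
            return k
    known.append(entity)
    return entity

known = ["knowledge graph"]
assert resolve("Knowledge Graphs", known) == "knowledge graph"  # merged
assert resolve("language model", known) == "language model"     # new node
assert known == ["knowledge graph", "language model"]
```

Because `known` grows as documents stream in, later extractions resolve against everything seen so far, which is what makes the construction incremental.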
Voted: Davis, Mike
#1591
-
Laleye 2023
Leveraging Knowledge Graph Embeddings to Enhance Contextual Representations for Relation Extraction
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14194 LNCS():19-31 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-41501-2_2 · Ref ID: 5286 Relation extraction task is a crucial and challenging aspect of Natural Language Processing. Several methods have surfaced as of late, exhibiting notable performance in addressing the task; however, most of these approaches rely on vast amounts of data from large-scale knowledge graphs or language models pretrained on voluminous corpora. In this paper, we hone in on the effective utilization of solely the knowledge supplied by a corpus to create a high-performing model. Our objective is to showcase that by leveraging the hierarchical structure and relational distribution of entities within a corpus without introducing external knowledge, a relation extraction model can achieve significantly enhanced performance. We therefore proposed a relation extraction approach based on the incorporation of pretrained knowledge graph embeddings at the corpus scale into the sentence-level contextual representation. We conducted a series of experiments which revealed promising and very interesting results for our proposed approach. The obtained results demonstrated an outperformance of our method compared to context-based relation extraction models. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
Voted: Mike, Srividya
#1614
-
Lan 2024
LLM4QA: Leveraging Large Language Model for Efficient Knowledge Graph Reasoning with SPARQL Query
As one of the core technologies of general artificial intelligence, knowledge graph reasoning aims to infer new knowledge from existing knowledge in a knowledge base, providing decision support for knowledge-driven intelligent information services such as information retrieval, question answering, and recommendation systems. However, issues such as poor interpretability and low reasoning efficiency still degrade knowledge reasoning performance. To tackle these challenges, this paper proposes LLM4QA, a knowledge graph reasoning method that leverages fine-tuned large language models with chain-of-thought to generate SPARQL (SPARQL Protocol and RDF Query Language) graph queries for reasoning. First, an efficient instruction fine-tuning method is applied to fine-tune open-source large language models with chain-of-thought. Then, the fine-tuned open-source model is used to convert natural language questions into logical forms. Finally, unsupervised entity-relationship retrieval is used to generate graph database queries, realizing a natural language knowledge graph question-answering framework. Experimental results demonstrate that this method achieves good inference accuracy and significantly improves retrieval efficiency. © 2024 by the authors.
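The question-to-SPARQL step can be sketched with a trivial pattern matcher standing in for the fine-tuned LLM (the patterns, prefixes, and query shape are illustrative assumptions, not the paper's logical forms):

```python
def to_sparql(logical_form):
    """Render a simple (subject, relation) logical form as a SPARQL query."""
    s, r = logical_form
    return f"SELECT ?o WHERE {{ :{s} :{r} ?o . }}"

def answer(question, patterns):
    """Toy NL -> logical-form step: match the question against known
    patterns, then compile the matched form to SPARQL."""
    for trigger, form in patterns.items():
        if trigger in question.lower():
            return to_sparql(form)
    return None

patterns = {"capital of france": ("France", "capital")}
q = answer("What is the capital of France?", patterns)
assert q == "SELECT ?o WHERE { :France :capital ?o . }"
```

Executing the generated query against a graph store, rather than letting the model answer directly, is what makes the reasoning step interpretable.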
Voted: Xinchen, mohammed afaan
#157
-
Lanchantin 2023
A Data Source for Reasoning Embodied Agents
37th AAAI Conference on Artificial Intelligence (AAAI) / 35th Conference on Innovative Applications of Artificial Intelligence / 13th Symposium on Educational Advances in Artificial Intelligence 2023;():8438-8446 Washington, DC Assoc Advancement Artificial Intelligence 2023 Ref ID: 3390 Recent progress in using machine learning models for reasoning tasks has been driven by novel model architectures, large-scale pre-training protocols, and dedicated reasoning datasets for fine-tuning. In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent. The generated data consists of templated text queries and answers, matched with world-states encoded into a database. The world-states are a result of both world dynamics and the actions of the agent. We show the results of several baseline models on instantiations of train sets. These include pre-trained language models fine-tuned on a text-formatted representation of the database, and graph-structured Transformers operating on a knowledge-graph representation of the database. We find that these models can answer some questions about the world-state, but struggle with others. These results hint at new research directions in designing neural reasoning models and database representations. Code to generate the data and train the models will be released at github.com/facebookresearch/neuralmemory. |
Voted: mohammed afaan, Ishan
#2221
-
Lau 2016
CASPR: A comprehensive cable-robot analysis and simulation platform for the research of cable-driven parallel robots
2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2016;():3004-3011 2016 DOI: 10.1109/IROS.2016.7759465 · Ref ID: 6973 The study of cable-driven parallel robots (CDPRs) has attracted much attention in recent years. However, to the best of the authors' knowledge, no single software platform exists for researchers to perform different types of analyses for CDPRs of arbitrary structure. In this paper, the Cable-robot Analysis and Simulation Platform for Research (CASPR) of CDPRs is introduced. Using this platform, arbitrary types and structures of CDPRs, such as single and multi-link CDPRs, can be studied for a wide range of analyses, including kinematics, dynamics, control and workspace analysis. CASPR achieves this using a general CDPR model representation and an abstracted software architecture. Moreover, CDPRs can be defined using Extensible Markup Language (XML) with out-of-the-box availability of an extensive range of robots and analysis tools. The open-source platform aims to provide both a communal environment for researchers to use and add models and algorithms to. The example case studies demonstrate the potential to perform analysis on CDPRs, directly compare algorithms and conveniently add new models and analyses.
Voted: mohammed afaan, yuexi
#19
-
Lawley 2023
Applications of Natural Language Processing to Geoscience Text Data and Prospectivity Modeling
Geological maps are powerful models for visualizing the complex distribution of rock types through space and time. However, the descriptive information that forms the basis for a preferred map interpretation is typically stored in geological map databases as unstructured text data that are difficult to use in practice. Herein we apply natural language processing (NLP) to geoscientific text data from Canada, the U.S., and Australia to address that knowledge gap. First, rock descriptions, geological ages, lithostratigraphic and lithodemic information, and other long-form text data are translated to numerical vectors, i.e., a word embedding, using a geoscience language model. Network analysis of word associations, nearest neighbors, and principal component analysis are then used to extract meaningful semantic relationships between rock types. We further demonstrate using simple Naive Bayes classifiers and the area under receiver operating characteristics plots (AUC) how word vectors can be used to: (1) predict the locations of "pegmatitic" (AUC = 0.962) and "alkalic" (AUC = 0.938) rocks; (2) predict mineral potential for Mississippi-Valley-type (AUC = 0.868) and clastic-dominated (AUC = 0.809) Zn-Pb deposits; and (3) search geoscientific text data for analogues of the giant Mount Isa clastic-dominated Zn-Pb deposit using the cosine similarities between word vectors. This form of semantic search is a promising NLP approach for assessing mineral potential with limited training data. Overall, the results highlight how geoscience language models and NLP can be used to extract new knowledge from unstructured text data and reduce the mineral exploration search space for critical raw materials. |
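Nearest-neighbour analysis over word vectors, as used above, reduces to cosine similarity in the embedding space; a self-contained sketch with invented 2-d "embeddings" (real geoscience language models use hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def most_similar(query, vocab):
    """Return the vocabulary term whose vector is closest to the query's."""
    q = vocab[query]
    return max((t for t in vocab if t != query),
               key=lambda t: cosine(q, vocab[t]))

# Toy 2-d vectors: carbonate rocks cluster along one axis.
vocab = {
    "limestone": [0.9, 0.1],
    "dolomite":  [0.8, 0.2],
    "granite":   [0.1, 0.9],
}
assert most_similar("limestone", vocab) == "dolomite"
```

The same similarity, computed between a deposit description and candidate regions, is the semantic-search mechanism the abstract applies to mineral prospectivity.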
Voted: mohammed afaan, yuexi
#311
-
Lawley 2022
Geoscience language models and their intrinsic evaluation
Geoscientists use observations and descriptions of the rock record to study the origins and history of our planet, which has resulted in a vast volume of scientific literature. Recent progress in natural language processing (NLP) has the potential to parse through and extract knowledge from unstructured text, but there has, so far, been only limited work on the concepts and vocabularies that are specific to geoscience. Herein we harvest and process public geoscientific reports (i.e., Canadian federal and provincial geological survey publications databases) and a subset of open access and peer-reviewed publications to train new, geoscience-specific language models to address that knowledge gap. Language model performance is validated using a series of new geoscience-specific NLP tasks (i.e., analogies, clustering, relatedness, and nearest neighbour analysis) that were developed as part of the current study. The raw and processed national geological survey corpora, language models, and evaluation criteria are all made public for the first time. We demonstrate that non-contextual (i.e., Global Vectors for Word Representation, GloVe) and contextual (i.e., Bidirectional Encoder Representations from Transformers, BERT) language models updated using the geoscientific corpora outperform the generic versions of these models for each of the evaluation criteria. Principal component analysis further demonstrates that word embeddings trained on geoscientific text capture meaningful semantic relationships, including rock classifications, mineral properties and compositions, and the geochemical behaviour of elements. Semantic relationships that emerge from the vector space have the potential to unlock latent knowledge within unstructured text, and perhaps more importantly, also highlight the potential for other downstream geoscience-focused NLP tasks (e.g., keyword prediction, document similarity, recommender systems, rock and mineral classification). |
Voted: Mike, Srividya
#1363
-
Le 2024
GraphLingo: Domain Knowledge Exploration by Synchronizing Knowledge Graphs and Large Language Models
Proceedings - International Conference on Data Engineering 2024;():5477-5480 IEEE Computer Society 2024 DOI: 10.1109/ICDE60146.2024.00432 · Ref ID: 4422 Knowledge graphs (KGs) are routinely curated to provide factual data for various domain-specific analyses. Nevertheless, it remains nontrivial to explore domain knowledge with standard query languages. We demonstrate GraphLingo, a natural language (NL)-based knowledge exploration system designed for exploring domain-specific knowledge graphs. It differs from conventional knowledge graph search tools in that it enables an interactive exploratory NL query over domain-specific knowledge graphs. GraphLingo seamlessly integrates graph query processing and large language models with a graph pattern-based prompt generation approach to guide users in exploring relevant factual knowledge. It streamlines NL-based question & answer, graph query optimization & refining, and automatic prompt generation. A unique feature of GraphLingo is its capability to enable users to explore by seamlessly switching between a more 'open' approach and a more relevant yet 'conservative' one, facilitated by diversified query suggestions. We show cases of GraphLingo in curriculum suggestion, and materials scientific data search. © 2024 IEEE. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3982
-
Le-Duc 2024
wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech
arXiv 2024;(): 2024 Ref ID: 8519 Knowledge graphs (KGs) enhance the performance of large language models (LLMs) and search engines by providing structured, interconnected data that improves reasoning and context-awareness. However, existing KGs focus only on text data, thereby neglecting other modalities such as speech. In this work, we introduce wav2graph, the first framework for supervised learning of knowledge graphs from speech data. Our pipeline is straightforward: (1) constructing a KG based on transcribed spoken utterances and a named entity database, (2) converting the KG into embedding vectors, and (3) training graph neural networks (GNNs) for node classification and link prediction tasks. Through extensive experiments conducted in inductive and transductive learning contexts using state-of-the-art GNN models, we provide baseline results and error analysis for node classification and link prediction tasks on human transcripts and automatic speech recognition (ASR) transcripts, including evaluations using both encoder-based and decoder-based node embeddings, as well as monolingual and multilingual acoustic pre-trained models. All related code, data, and models are published online. |
yuexi
voted
Mike
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#450
-
Lee 2023
Knowledge Graph-based Genetic Fuzzy Agent for Human Intelligence and Machine Co-Learning
IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) 2023;(): Incheon, SOUTH KOREA Ieee 2023 DOI: 10.1109/fuzz52849.2023.10309699 · Ref ID: 3035 This paper proposes a novel approach for evaluating the co-learning performance of human intelligence (HI) and machine intelligence (MI) using a Knowledge Graph-based genetic fuzzy agent. The agent utilizes a Knowledge Graph structure to represent a specific knowledge domain related to human learning and employs a genetic fuzzy learning mechanism to construct a personalized learning model. Human learners can engage in co-learning with machines using state-of-the-art AI tools such as the Meta AI S2ST Taiwanese-English language model and the OpenAI ChatGPT text model. The proposed approach was evaluated using human learning data from an undergraduate computer science course and a series of Taiwanese and English language translation experience activities. The experimental results indicate that the proposed approach can effectively enhance the co-learning process for both human and machine learners. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1936
-
Lee 2023
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():544-557 Association for Computational Linguistics (ACL) 2023 Ref ID: 5014 Temporal knowledge graph (TKG) forecasting benchmarks challenge models to predict future facts using knowledge of past facts. In this paper, we develop an approach to use in-context learning (ICL) with large language models (LLMs) for TKG forecasting. Our extensive evaluation compares diverse baselines, including both simple heuristics and state-of-the-art (SOTA) supervised models, against pre-trained LLMs across several popular benchmarks and experimental settings. We observe that naive LLMs perform on par with SOTA models, which employ carefully designed architectures and supervised training for the forecasting task, falling within the (-3.6%, +1.5%) Hits@1 margin relative to the median performance. To better understand the strengths of LLMs for forecasting, we explore different approaches for selecting historical facts, constructing prompts, controlling information propagation, and parsing outputs into a probability distribution. A surprising finding from our experiments is that LLM performance endures (±0.4% Hits@1) even when semantic information is removed by mapping entities/relations to arbitrary numbers, suggesting that prior semantic knowledge is unnecessary; rather, LLMs can leverage the symbolic patterns in the context to achieve such a strong performance. Our analysis also reveals that ICL enables LLMs to learn irregular patterns from the historical context, going beyond frequency and recency biases. ©2023 Association for Computational Linguistics. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3267
-
Lee 2024
Collaboratively adding new knowledge to an LLM
arXiv 2024;(): 2024 Ref ID: 8731 We address the question of how to successively add new knowledge to an LLM whilst retaining previously-added knowledge. We consider two settings, semi-cooperative and fully-cooperative. Overall, LoRA performs better in most cases than full-fine tuning of all parameters when both new knowledge acquisition and retention of old, including recent, knowledge are taken into account. In the semi-cooperative setting, where datasets are not available after training, MOE mixing, model merging, and LoRA-based orthogonal subspace sequential learning, using a small weight on the orthogonality term, perform well. In the fully-cooperative setting where datasets remain available, joint training and sequential training with replay are both effective approaches with LoRA training generally preferable to full fine-tuning. The codes needed to reproduce the results are provided in an open source repository. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#2004
-
Levy 2023
Understanding Natural Language in Context
Proceedings International Conference on Automated Planning and Scheduling, ICAPS 2023;33():659-667 Association for the Advancement of Artificial Intelligence 2023 DOI: 10.1609/icaps.v33i1.27248 · Ref ID: 5225 Recent years have seen an increasing number of applications that have a natural language interface, either in the form of chatbots or via personal assistants such as Alexa (Amazon), Google Assistant, Siri (Apple), and Cortana (Microsoft). To use these applications, a basic dialog between the assistant and the human is required. While this kind of dialog exists today mainly within static robots that do not make any movement in the household space, the challenge of reasoning about the information conveyed by the environment increases significantly when dealing with robots that can move and manipulate objects in our home environment. In this paper, we focus on cognitive robots, which have some knowledge-based models of the world and operate by reasoning and planning with this model. Thus, when the robot and the human communicate, there is already some formalism they can use - the robot's knowledge representation formalism. In this paper we describe an approach for translating natural language directives into the robot's formalism, allowing much more complicated household tasks to be completed. We do so by combining off-the-shelf SoTA large language models, planning tools, and the robot knowledge of the state of the world and of its own model. This results in much more accurate interpretation of directives in natural language. Copyright © 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#471
-
Li 2019
Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation
33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence 2019;():6666-6673 Honolulu, HI Assoc Advancement Artificial Intelligence 2019 Ref ID: 3515 Generating long and semantic-coherent reports to describe medical images poses great challenges towards bridging visual and linguistic modalities, incorporating medical domain knowledge, and generating realistic and accurate descriptions. We propose a novel Knowledge-driven Encode, Retrieve, Paraphrase (KERP) approach which reconciles traditional knowledge- and retrieval-based methods with modern learning-based methods for accurate and robust medical report generation. Specifically, KERP decomposes medical report generation into explicit medical abnormality graph learning and subsequent natural language modeling. KERP first employs an Encode module that transforms visual features into a structured abnormality graph by incorporating prior medical knowledge; then a Retrieve module that retrieves text templates based on the detected abnormalities; and lastly, a Paraphrase module that rewrites the templates according to specific cases. The core of KERP is a proposed generic implementation unit-Graph Transformer (GTR) that dynamically transforms high-level semantics between graph-structured data of multiple domains such as knowledge graphs, images and sequences. Experiments show that the proposed approach generates structured and robust reports supported with accurate abnormality description and explainable attentive regions, achieving the state-of-the-art results on two medical report benchmarks, with the best medical abnormality and disease classification accuracy and improved human evaluation performance. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3524
-
Li 2024
Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems
arXiv 2024;(): 2024 Ref ID: 8273 This paper aims to efficiently enable large language models (LLMs) to use external knowledge and goal guidance in conversational recommender system (CRS) tasks. Advanced LLMs (e.g., ChatGPT) are limited in domain-specific CRS tasks for 1) generating grounded responses with recommendation-oriented knowledge, or 2) proactively leading the conversations through different dialogue goals. In this work, we first analyze those limitations through a comprehensive evaluation, showing the necessity of external knowledge and goal guidance which contribute significantly to the recommendation accuracy and language quality. In light of this finding, we propose a novel ChatCRS framework to decompose the complex CRS task into several sub-tasks through the implementation of 1) a knowledge retrieval agent using a tool-augmented approach to reason over external Knowledge Bases and 2) a goal-planning agent for dialogue goal prediction. Experimental results on two multi-goal CRS datasets reveal that ChatCRS sets new state-of-the-art benchmarks, improving language quality of informativeness by 17% and proactivity by 27%, and achieving a tenfold enhancement in recommendation accuracy. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#831
-
Li 2020
Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():1427-1438 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3368 Machine reading comprehension (MRC) has achieved significant progress in the open domain in recent years, mainly due to large-scale pre-trained language models. However, it performs much worse in specific domains such as the medical field, due to the lack of extensive training data and the neglect of professional structural knowledge. As an effort, we first collect a large-scale medical multi-choice question dataset (more than 21k instances) for the National Licensed Pharmacist Examination in China. It is a challenging medical examination with a passing rate of less than 14.2% in 2018. Then we propose a novel reading comprehension model, KMQA, which can fully exploit structural medical knowledge (i.e., a medical knowledge graph) and reference medical plain text (i.e., text snippets retrieved from reference books). The experimental results indicate that KMQA outperforms existing competitive models by a large margin and passes the exam with a 61.8% accuracy rate on the test set. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#394
-
Li 2024
Joint extraction of Chinese medical entities and relations based on RoBERTa and single-module global pointer
Background: Most Chinese joint entity and relation extraction tasks in medicine involve numerous nested entities, overlapping relations, and other challenging extraction issues. In response to these problems, some traditional methods decompose the joint extraction task into multiple steps or multiple modules, which introduces local dependency. Methods: To alleviate this issue, we propose a joint extraction model of Chinese medical entities and relations based on RoBERTa and single-module global pointer, namely RSGP, which formulates joint extraction as a global pointer linking problem. Considering the uniqueness of Chinese language structure, we introduce the RoBERTa-wwm pre-trained language model at the encoding layer to obtain a better embedding representation. Then, we represent the input sentence as a third-order tensor and score each position in the tensor to prepare for the subsequent process of decoding the triples. In the end, we design a novel single-module global pointer decoding approach to alleviate the generation of redundant information. Specifically, we analyze the decoding process of single-character entities individually, improving the time and space performance of RSGP to some extent. Results: In order to verify the effectiveness of our model in extracting Chinese medical entities and relations, we carry out experiments on the public dataset, CMeIE. Experimental results show that RSGP performs significantly better on the joint extraction of Chinese medical entities and relations, and achieves state-of-the-art results compared with baseline models. Conclusion: The proposed RSGP can effectively extract entities and relations from Chinese medical texts and help to realize the structuring of Chinese medical texts, so as to provide high-quality data support for the construction of Chinese medical knowledge graphs. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1570
-
Li 2023
Large Language Models with Controllable Working Memory
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():1774-1793 Association for Computational Linguistics (ACL) 2023 Ref ID: 5209 Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP), partly owing to the massive amounts of world knowledge they memorize during pretraining. While many downstream applications provide the model with an informational context to aid its underlying task, how the model's world knowledge interacts with the factual information presented in the context remains under explored. As a desirable behavior, an LLM should give precedence to the context whenever it contains task-relevant information that conflicts with the model's memorized knowledge. This enables model predictions to be grounded in the context, which then facilitates updating specific model predictions without frequently retraining the model. By contrast, when the context is irrelevant to the task, the model should ignore it and fall back on its internal knowledge. In this paper, we undertake a first joint study of the aforementioned two properties, namely controllability and robustness, in the context of LLMs. We demonstrate that state-of-the-art T5 and PaLM models (both pretrained and finetuned) could exhibit low controllability and robustness that does not improve with increasing the model size. As a solution, we propose a simple yet effective method - knowledge aware finetuning (KAFT) - to strengthen both controllability and robustness by injecting counterfactual and irrelevant contexts to standard supervised datasets. Our comprehensive evaluation showcases the utility of KAFT across model architectures and sizes. © 2023 Association for Computational Linguistics. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1465
-
Li 2024
KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9693-9704 European Language Resources Association (ELRA) 2024 Ref ID: 4510 Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs) and integrate these external data sources into language models via self-supervised learning. Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration. In this paper, we propose to learn Knowledge-Enhanced language representations with Hierarchical Reinforcement Learning (KEHRL), which jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge. Specifically, a high-level reinforcement learning (RL) agent utilizes both internal and prior knowledge to iteratively detect essential positions in texts for knowledge injection, which filters out less meaningful entities to avoid diverting the knowledge learning direction. Once the entity positions are selected, a relevant triple filtration module is triggered to perform low-level RL to dynamically refine the triples associated with polysemic entities through binary-valued actions. Experiments validate KEHRL's effectiveness in probing factual knowledge and enhancing the model's performance on various natural language understanding tasks. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#586
-
Li 2023
Multi-task Pre-training Language Model for Semantic Network Completion
ACM Trans. Asian Low-Resour. Lang. Inf. Process. 2023;22(11):19 2023 DOI: 10.1145/3627704 · Ref ID: 3105 Semantic networks, exemplified by the knowledge graph, serve as a means to represent knowledge by leveraging the structure of a graph. While the knowledge graph exhibits promising potential in the field of natural language processing, it suffers from incompleteness. This article focuses on the task of completing knowledge graphs by predicting linkages between entities, which is fundamental yet critical. Traditional methods based on translational distance struggle when dealing with unseen entities. In contrast, semantic matching presents itself as a potential solution due to its ability to handle such cases. However, semantic matching-based approaches necessitate large-scale datasets for effective training, which are typically unavailable in practical scenarios, hindering their competitive performance. To address this challenge, we propose a novel architecture for knowledge graphs known as LP-BERT, which incorporates a language model. LP-BERT consists of two primary stages: multi-task pre-training and knowledge graph fine-tuning. During the pre-training phase, the model acquires relationship information from triples by predicting either entities or relations through three distinct tasks. In the fine-tuning phase, we introduce a batch-based triple-style negative sampling technique inspired by contrastive learning. This method significantly increases the proportion of negative sampling while maintaining a nearly unchanged training time. Furthermore, we propose a novel data augmentation approach that leverages the inverse relationship of triples to enhance both the performance and robustness of the model. To demonstrate the effectiveness of our proposed framework, we conduct extensive experiments on three widely used knowledge graph datasets: WN18RR, FB15k-237, and UMLS.
The experimental results showcase the superiority of our methods, with LP-BERT achieving state-of-the-art performance on the WN18RR and FB15k-237 datasets. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3202
-
Li 2024
Automated Clinical Data Extraction with Knowledge Conditioned LLMs
arXiv 2024;(): 2024 Ref ID: 8424 The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. Large language models (LLMs) can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Our knowledge-conditioned approach also improves the accuracy and reliability of LLM outputs by addressing the extraction task in two stages: (i) lung lesion finding detection and primary structured field parsing, followed by (ii) further parsing of lesion description text into additional structured fields. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1449
-
Li 2022
A Joint Extraction Strategy for Chinese Medical Text Based on Sequence Tagging
Proceedings - 2022 International Conference on Computer Engineering and Artificial Intelligence, ICCEAI 2022 2022;():6-10 Institute of Electrical and Electronics Engineers Inc. 2022 DOI: 10.1109/ICCEAI55464.2022.00011 · Ref ID: 5531 Research on entity and relation extraction in medical text is the basis of constructing medical knowledge graphs. Currently, the mainstream pipelined extraction methods do not consider the connection between entity recognition and relation classification, and cannot address the problem of overlapping relations among the triplets. This paper proposes a joint extraction strategy for entities and relations in Chinese medical text based on sequence tagging, which splits the joint extraction task into two sequence tagging subtasks, namely HE and TRE, establishing the connection between subtasks through a shared encoding layer and the semantic information of the head entity. The pre-trained language model RoBERTa is incorporated to obtain richer numerical representations of word vectors, which are then fused with part-of-speech vectors as word representation inputs for joint extraction, in combination with the GRU-BiLSTM model to extract entities and relations directly. Experimental results show that this model achieves a 54.44% F-value on the Chinese medical dataset CMeIE, which outperforms the extraction performance of other pre-trained language models. © 2022 IEEE. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1512
-
Li 2023
Knowledge graph representation learning model combining entity description and path information
Knowledge graph representation learning is a process of representing knowledge graph entities and relations in a multidimensional vector through specific rules. Existing representation learning methods are mostly used to solve the single-hop knowledge graph question-and-answer task, but their multi-hop reasoning ability cannot meet the actual demand. To improve the multi-hop reasoning ability, a knowledge graph representation learning model combining entity description and path information is proposed. First, the learning vector of entity and relation representation is obtained using the pre-training language model RoBERTa. Second, OPTransE is used to transform the knowledge graph into a vector integrating the path information of an ordered relation. Finally, the total energy function is constructed to fuse the vectors of entity description and path information. The feasibility and validity of the model are verified by comparing its performance in a link prediction task with that of the mainstream knowledge graph representation learning model. © 2023, Editorial Department of CAAI Transactions on Intelligent Systems. All rights reserved. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#277
-
Li 2021
Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models
Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():1558-1568 Electr Network Assoc Computational Linguistics-Acl 2021 Ref ID: 3026 This paper studies how to automatically generate a natural language text that describes the facts in a knowledge graph (KG). Considering the few-shot setting, we leverage the excellent capacities of pretrained language models (PLMs) in language understanding and generation. We make three major technical contributions, namely representation alignment for bridging the semantic gap between KG encodings and PLMs, relation-biased KG linearization for deriving better input representations, and multi-task learning for learning the correspondence between KG and text. Extensive experiments on three benchmark datasets have demonstrated the effectiveness of our model on the KG-to-text generation task. In particular, our model outperforms all comparison methods in both fully-supervised and few-shot settings. Our code and datasets are available at https://github.com/RUCAIBox/Few-Shot-KG2Text. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1145
-
Li 2024
COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():1669-1682 Association for Computational Linguistics (ACL) 2024 Ref ID: 4480 Knowledge graph completion (KGC) aims to infer missing facts based on existing facts within a KG. Recently, research on generative models (GMs) has addressed the limitations of embedding methods in terms of generality and scalability. However, GM-based methods are sensitive to contextual facts in the KG, so contextual facts of poor quality can cause GMs to generate erroneous results. To improve the performance of GM-based methods for various KGC tasks, we propose a COntextual FactS GuIded GeneratioN (COSIGN) model. First, to enhance the inference ability of the generative model, we designed a contextual facts collector to achieve human-like retrieval behavior. Second, a contextual facts organizer is proposed to learn the organizing capabilities of LLMs through knowledge distillation. Finally, the organized contextual facts serve as the input to the inference generator, which generates the missing facts. Experimental results demonstrate that COSIGN outperforms state-of-the-art baseline techniques in terms of performance. ©2024 Association for Computational Linguistics. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3659
-
Li 2024
LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning
arXiv 2024;(): 2024 Ref ID: 8697 Large language models (LLMs) sometimes demonstrate poor performance on knowledge-intensive tasks; commonsense reasoning is one of them. Researchers typically address these issues by retrieving related knowledge from knowledge graphs or employing self-enhancement methods to elicit knowledge in LLMs. However, noisy knowledge and invalid reasoning issues hamper their ability to answer questions accurately. To this end, we propose a novel method named eliciting, filtering and integrating knowledge in large language models (LINKED). In it, we design a reward model to filter out the noisy knowledge and a marginal consistent reasoning module to reduce invalid reasoning. In our comprehensive experiments on two complex commonsense reasoning benchmarks, our method outperforms SOTA baselines (up to 9.0% improvement in accuracy). Besides, to measure the positive and negative impact of the injected knowledge, we propose a new metric called the effectiveness-preservation score for knowledge-enhancement work. Finally, through extensive experiments, we conduct an in-depth analysis and find many meaningful conclusions about LLMs in commonsense reasoning tasks. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3577
-
Li 2024
Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning
arXiv 2024;(): 2024 Ref ID: 8383 Large language models (LLMs) have demonstrated remarkable capabilities but still face challenges such as hallucinations. One potential reason for hallucinations is the lack of relevant knowledge or context. Thus, a promising solution involves instructing LLMs to respond with "I do not know" when a question falls outside their knowledge domain or the provided context. However, in this work, we observed that LLMs struggle to admit their lack of knowledge, primarily due to existing instruction datasets designed to encourage specific answers. To improve models' capability to recognize the boundaries of their knowledge, we propose a novel approach called uncertainty-sensitive tuning. This method involves two-stage training designed for uncertainty recognition and prompt-sensitive activation. In the first stage, we guide the LLM to reject unknown questions. In the second stage, we force the model to follow the instructions by incorporating designed causal instructions. The experimental results demonstrate that our proposed uncertainty-sensitive tuning method enhances the model's ability to identify areas of uncertainty. Specifically, it achieves a substantial improvement of up to 34.7% in handling questions involving knowledge gaps compared to the original model. Moreover, our finetuned models even outperform GPT-4, exhibiting an overall performance improvement of up to 4.2%. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1532
-
Li 2022
A Knowledge-Enhanced Model with Dual-Channel Encoder for Joint Entity and Relation Extraction from Biomedical Literature
Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022 2022;():773-776 Institute of Electrical and Electronics Engineers Inc. 2022 DOI: 10.1109/BIBM55620.2022.9995158 · Ref ID: 5466 Biomedical entity and relation extraction has attracted increasing attention recently, yet it remains challenging due to the domain-specific features of the biomedical corpus. Hence, many researchers consider utilizing external knowledge from large-scale databases to enhance the semantic understanding of models. However, these knowledge-enhanced methods usually enrich context information by incorporating context-independent knowledge into entity representations, and lack effective interaction. Inspired by pre-trained language models, we argue that knowledge representations need to be trainable and adapted to different contexts. Therefore, we propose the Knowledge-enhanced Dual-channel Iterative Model (KeDcIM), a novel end-to-end joint model for biomedical entity and relation extraction. Experiments show that KeDcIM achieves new state-of-the-art results on two benchmark datasets. © 2022 IEEE. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1610
-
Li 2024
LLM-based Multi-Level Knowledge Generation for Few-shot Knowledge Graph Completion
IJCAI International Joint Conference on Artificial Intelligence 2024;():2135-2143 International Joint Conferences on Artificial Intelligence 2024 Ref ID: 4384 Knowledge Graphs (KGs) are pivotal in various NLP applications but often grapple with incompleteness, especially due to the long-tail problem where infrequent, unpopular relationships drastically reduce the KG completion performance. In this paper, we focus on Few-shot Knowledge Graph Completion (FKGC), a task addressing these gaps in long-tail scenarios. Amidst the rapid evolution of Large Language Models, we propose a generation-based FKGC paradigm facilitated by LLM distillation. Our MuKDC framework employs multi-level knowledge distillation for few-shot KG completion, generating supplementary knowledge to mitigate data scarcity in few-shot environments. MuKDC comprises two primary components: Multi-level Knowledge Generation, which enriches the KG at various levels, and Consistency Assessment, to ensure the coherence and reliability of the generated knowledge. Most notably, our method achieves SOTA results in both FKGC and multi-modal FKGC benchmarks, significantly advancing KG completion and enhancing the understanding and application of LLMs in structured knowledge generation and assessment. © 2024 International Joint Conferences on Artificial Intelligence. All rights reserved. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#249
-
Li 2025
Explainable reasoning over temporal knowledge graphs by pre-trained language model
Temporal knowledge graph reasoning (TKGR) has been considered as a crucial task for modeling the evolving knowledge, aiming to infer the unknown connections between entities at specific times. Traditional TKGR methods try to aggregate structural information between entities and evolve representations of entities over distinct snapshots, while some other methods attempt to extract temporal logic rules from historical interactions. However, these methods fail to address the continuously emerging unseen entities over time and ignore the historical dependencies between entities and relations. To overcome these limitations, we propose a novel method, termed TPNet, which introduces historical information completion strategy (HICS) and pre-trained language model (PLM) to conduct explainable inductive reasoning over TKGs. Specifically, TPNet extracts reliable temporal logical paths from historical subgraphs using a temporal-correlated search strategy. For unseen entities, we utilize HICS to sample or generate paths to supplement their historical information. Besides, a PLM and a time-aware encoder are introduced to jointly encode the temporal paths, thereby comprehensively capturing dependencies between entities and relations. Moreover, the semantic similarity between the query quadruples and the extracted paths is evaluated to simultaneously optimize the representations of entities and relations. Extensive experiments on entity and relation prediction tasks are conducted to evaluate the performance of TPNet. The experimental results on four benchmark datasets demonstrate the superiority of TPNet over state-of-the-art TKGR methods, achieving improvements of 14.35%, 23.08%, 6.75% and 5.38% on MRR, respectively. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1360
-
Li 2023
Graph Reasoning for Question Answering with Triplet Retrieval
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():3366-3375 Association for Computational Linguistics (ACL) 2023 Ref ID: 5119 Answering complex questions often requires reasoning over knowledge graphs (KGs). State-of-the-art methods often utilize entities in questions to retrieve local subgraphs, which are then fed into KG encoder, e.g. graph neural networks (GNNs), to model their local structures and integrated into language models for question answering. However, this paradigm constrains retrieved knowledge in local subgraphs and discards more diverse triplets buried in KGs that are disconnected but useful for question answering. In this paper, we propose a simple yet effective method to first retrieve the most relevant triplets from KGs and then rerank them, which are then concatenated with questions to be fed into language models. Extensive results on both CommonsenseQA and OpenbookQA datasets show that our method can outperform state-of-the-art up to 4.6% absolute accuracy. © 2023 Association for Computational Linguistics. |
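The retrieve-then-rerank pipeline this abstract describes can be sketched in a few lines. This is an illustrative toy version only: the bag-of-words `embed` function, the cosine scoring, and the `[SEP]` concatenation format are assumptions standing in for the paper's actual retriever, reranker, and prompt format.

```python
import numpy as np

def embed(text, vocab):
    # Toy bag-of-words embedding; a stand-in for a real sentence encoder.
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def retrieve_and_rerank(question, triplets, top_k=2):
    # Build a shared vocabulary over the question and all triplets.
    tokens = set(question.lower().split())
    for t in triplets:
        tokens |= set(" ".join(t).lower().split())
    vocab = {w: i for i, w in enumerate(sorted(tokens))}
    q_vec = embed(question, vocab)
    # Score every triplet against the question by cosine similarity.
    scored = []
    for t in triplets:
        t_vec = embed(" ".join(t), vocab)
        denom = np.linalg.norm(q_vec) * np.linalg.norm(t_vec)
        scored.append((float(q_vec @ t_vec / denom) if denom else 0.0, t))
    # Rerank: keep only the top_k most relevant triplets.
    scored.sort(key=lambda x: x[0], reverse=True)
    kept = [" ".join(t) for _, t in scored[:top_k]]
    # Concatenate the reranked triplets with the question for the language model.
    return " [SEP] ".join(kept + [question])

triplets = [
    ("bird", "capable_of", "fly"),
    ("fish", "lives_in", "water"),
    ("bird", "has", "feathers"),
]
prompt = retrieve_and_rerank("can a bird fly", triplets)
```

Because scoring is done per-triplet rather than over a connected subgraph, disconnected but relevant facts are not discarded, which is the key point of the paper's method.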
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1419
-
Li 2022
Instilling Type Knowledge in Language Models via Multi-Task QA
Findings of the Association for Computational Linguistics: NAACL 2022 - Findings 2022;():594-603 Association for Computational Linguistics (ACL) 2022 Ref ID: 5589 Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge: their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges. © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#449
-
Li 2023
Knowledge Graph-Based Credibility Evaluation Method for Electric Grid Large Language Model Knowledge Question-Answering
7th International Conference on Electronic Information Technology and Computer Engineering (EITCE) 2023;():754-759 Xiamen, PEOPLES R CHINA Assoc Computing Machinery 2023 DOI: 10.1145/3650400.3650526 · Ref ID: 2946 In the field of electricity, specialized terminology is often intricate and complex, making it challenging for non-experts to comprehend. However, with the advancement of artificial intelligence technology, the emergence of large language models provides a new technological solution to address this issue. Large language models, based on deep learning techniques, have the capability to quickly understand and interpret specialized terminology in the electricity domain through learning from a vast corpus of professional literature and data. They can then be applied to various domains, including question-answering systems. However, existing large language models still face issues of unreliable outputs, necessitating a method to evaluate their results and improve the quality of their applications. We propose a knowledge graph-based credibility evaluation method for electric grid large language model knowledge question-answering. This method aligns the answers generated by large language models with the knowledge graph of a local knowledge base and calculates their cosine similarity and Pearson correlation coefficient. We batch-process the answers from the large language model into an electricity dataset and validate them using this method. Experimental results demonstrate that this method can accurately and efficiently reflect the relevance between texts, providing a reliable scoring basis for question-answering by large models in vertical domains. Future research can focus on exploring other embedding methods that can better extract semantic relationships between texts and validating the feasibility of this method in vertical domains other than electricity. |
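The scoring step this abstract describes, aligning an LLM answer with local knowledge-base text via cosine similarity and the Pearson correlation coefficient, reduces to two standard formulas. A minimal sketch, assuming the two texts have already been embedded as fixed-length vectors (the example vectors below are made up for illustration):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(theta) = a.b / (|a| |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    # Off-diagonal entry of the 2x2 correlation matrix.
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical embeddings of an LLM answer and the matching KG passage.
answer_vec = np.array([0.2, 0.7, 0.1, 0.5])
kg_vec = np.array([0.25, 0.6, 0.15, 0.45])

cos = cosine_similarity(answer_vec, kg_vec)
r = pearson_correlation(answer_vec, kg_vec)
```

Cosine similarity measures directional agreement of the raw vectors, while Pearson correlation first centers them, so the two scores can diverge when embeddings have a large shared offset.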
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1982
-
Li 2024
Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():493-516 Association for Computational Linguistics (ACL) 2024 Ref ID: 4387 Although achieving great success, Large Language Models (LLMs) usually suffer from unreliable hallucinations. Although language attribution can be a potential solution, there are no suitable benchmarks and evaluation metrics to attribute LLMs to structured knowledge. In this paper, we define a new task of Knowledge-aware Language Model Attribution (KaLMA) that improves upon three core concerns with conventional attributed LMs. First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios. Second, we propose a new “Conscious Incompetence" setting considering the incomplete knowledge repository, where the model identifies the need for supporting knowledge beyond the provided KG. Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text citation alignment. To implement the above innovations, we build a dataset in biography domain BioKaLMA via evolutionary question generation strategy, to control the question complexity and necessary knowledge to the answer. For evaluation, we develop a baseline solution and demonstrate the room for improvement in LLMs' citation generation, emphasizing the importance of incorporating the "Conscious Incompetence" setting, and the critical role of retrieval accuracy. © 2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#239
-
Li 2024
Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation
Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner-relying only on readily available pre-trained models-facilitates the development of AI-enhanced learning. |
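The Recall@3 and MRR figures reported in this abstract follow the standard ranking-metric definitions; a small sketch for reference (the function names and example rankings are illustrative, not taken from the study):

```python
def recall_at_k(ranked, relevant, k=3):
    # Fraction of the relevant items that appear in the top-k of the ranking.
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def mean_reciprocal_rank(rankings, relevant_sets):
    # Average of 1/rank of the first relevant item in each ranking.
    total = 0.0
    for ranked, relevant in zip(rankings, relevant_sets):
        for pos, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / pos
                break
    return total / len(rankings)

# Toy example: two exercises, each linked to one relevant textbook section.
rankings = [["sec2", "sec5", "sec1"], ["sec3", "sec4", "sec9"]]
relevant = [{"sec5"}, {"sec9"}]
```

With these toy rankings the first relevant sections sit at ranks 2 and 3, so the MRR is (1/2 + 1/3) / 2, while both queries achieve perfect Recall@3.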
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#657
-
Li 2024
PMET: Precise Model Editing in a Transformer
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18564-18572 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3773 Model editing techniques modify a minor proportion of knowledge in Large Language Models (LLMs) at a relatively low cost and have demonstrated notable success. Existing methods assume Transformer Layer (TL) hidden states are values of key-value memories of the Feed-Forward Network (FFN). They usually optimize the TL hidden states to memorize target knowledge and use it to update the weights of the FFN in LLMs. However, the information flow of TL hidden states comes from three parts: Multi-Head Self-Attention (MHSA), FFN, and residual connections. Existing methods neglect the fact that the TL hidden states contain information not specifically required for FFN. Consequently, the performance of model editing decreases. To achieve more precise model editing, we analyze hidden states of MHSA and FFN, finding that MHSA encodes certain general knowledge extraction patterns. This implies that MHSA weights do not require updating when new knowledge is introduced. Based on the above findings, we introduce PMET, which simultaneously optimizes Transformer Component (TC, namely MHSA and FFN) hidden states, while only using the optimized TC hidden states of FFN to precisely update FFN weights. Our experiments demonstrate that PMET exhibits state-of-the-art performance on both the COUNTERFACT and zsRE datasets. Our ablation experiments substantiate the effectiveness of our enhancements, further reinforcing the finding that the MHSA encodes certain general knowledge extraction patterns and indicating its storage of a small amount of factual knowledge. Our code is available at \url{https://github.com/xpq-tech/PMET}. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1522
-
Li 2024
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():666-676 European Language Resources Association (ELRA) 2024 Ref ID: 4541 Knowledge-based, open-domain dialogue generation aims to build chit-chat systems that talk to humans using mined support knowledge. Many types and sources of knowledge have previously been shown to be useful as support knowledge. Even in the era of large language models, response generation grounded in knowledge retrieved from additional up-to-date sources remains a practically important approach. While prior work using single-source knowledge has shown a clear positive correlation between the performances of knowledge selection and response generation, there are no existing multi-source datasets for evaluating support knowledge retrieval. Further, prior work has assumed that the knowledge sources available at test time are the same as during training. This unrealistic assumption unnecessarily handicaps models, as new knowledge sources can become available after a model is trained. In this paper, we present a high-quality benchmark named multi-source Wizard of Wikipedia (Ms.WoW) for evaluating multi-source dialogue knowledge selection and response generation. Unlike existing datasets, it contains clean support knowledge, grounded at the utterance level and partitioned into multiple knowledge sources. We further propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources in a zero-shot fashion. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3130
-
Li 2024
FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets
Proceedings of the 5th ACM International Conference on AI in Finance 2024;():573–581 Brooklyn, NY, USA Association for Computing Machinery 2024 DOI: 10.1145/3677052.3698603 · Ref ID: 7244 |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3898
-
Li 2024
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
arXiv 2024;(): 2024 Ref ID: 8058 The general capabilities of large language models (LLMs) make them the infrastructure for various AI applications, but updating their inner knowledge requires significant resources. Recent model editing is a promising technique for efficiently updating a small amount of knowledge of LLMs and has attracted much attention. In particular, local editing methods, which directly update model parameters, are more suitable for updating a small amount of knowledge. Local editing methods update weights by computing least squares closed-form solutions and identify edited knowledge by vector-level matching in inference, which achieve promising results. However, these methods still require a lot of time and resources to complete the computation. Moreover, vector-level matching lacks reliability, and such updates disrupt the original organization of the model's parameters. To address these issues, we propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching and adds them to the subject word embeddings in Transformer input. To get these editing embeddings, we propose an optimizing-then-suppressing fusion method, which first optimizes learnable embedding vectors for the editing target and then suppresses the Knowledge Embedding Dimensions (KEDs) to obtain final editing embeddings. We thus propose the SWEA$\oplus$OS method for editing factual knowledge in LLMs. We demonstrate the overall state-of-the-art (SOTA) performance of SWEA$\oplus$OS on the \textsc{CounterFact} and zsRE datasets. To further validate the reasoning ability of SWEA$\oplus$OS in editing knowledge, we evaluate it on the more complex \textsc{RippleEdits} benchmark. The results demonstrate that SWEA$\oplus$OS possesses SOTA reasoning ability. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3248
-
Li 2024
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks
arXiv 2024;(): 2024 Ref ID: 8649 State-of-the-art large language models (LLMs) exhibit impressive problem-solving capabilities but may struggle with complex reasoning and factual correctness. Existing methods harness the strengths of chain-of-thought and retrieval-augmented generation (RAG) to decompose a complex problem into simpler steps and apply retrieval to improve factual correctness. These methods work well on straightforward reasoning tasks but often falter on challenging tasks such as competitive programming and mathematics, due to frequent reasoning errors and irrelevant knowledge retrieval. To address this, we introduce Critic-guided planning with Retrieval-augmentation, CR-Planner, a novel framework that leverages fine-tuned critic models to guide both reasoning and retrieval processes through planning. CR-Planner solves a problem by iteratively selecting and executing sub-goals. Initially, it identifies the most promising sub-goal from reasoning, query generation, and retrieval, guided by rewards given by a critic model named sub-goal critic. It then executes this sub-goal through sampling and selecting the optimal output based on evaluations from another critic model named execution critic. This iterative process, informed by retrieved information and critic models, enables CR-Planner to effectively navigate the solution space towards the final answer. We employ Monte Carlo Tree Search to collect the data for training the critic models, allowing for a systematic exploration of action sequences and their long-term impacts. We validate CR-Planner on challenging domain-knowledge-intensive and reasoning-heavy tasks, including competitive programming, theorem-driven math reasoning, and complex domain retrieval problems. Our experiments demonstrate that CR-Planner significantly outperforms baselines, highlighting its effectiveness in addressing challenging problems by improving both reasoning and retrieval. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3624
-
Li 2024
Large Language Model Agent for Fake News Detection
arXiv 2024;(): 2024 Ref ID: 8271 In the current digital era, the rapid spread of misinformation on online platforms presents significant challenges to societal well-being, public trust, and democratic processes, influencing critical decision making and public opinion. To address these challenges, there is a growing need for automated fake news detection mechanisms. Pre-trained large language models (LLMs) have demonstrated exceptional capabilities across various natural language processing (NLP) tasks, prompting exploration into their potential for verifying news claims. Instead of employing LLMs in a non-agentic way, where LLMs generate responses based on direct prompts in a single shot, our work introduces FactAgent, an agentic approach of utilizing LLMs for fake news detection. FactAgent enables LLMs to emulate human expert behavior in verifying news claims without any model training, following a structured workflow. This workflow breaks down the complex task of news veracity checking into multiple sub-steps, where LLMs complete simple tasks using their internal knowledge or external tools. At the final step of the workflow, LLMs integrate all findings throughout the workflow to determine the news claim's veracity. Compared to manual human verification, FactAgent offers enhanced efficiency. Experimental studies demonstrate the effectiveness of FactAgent in verifying claims without the need for any training process. Moreover, FactAgent provides transparent explanations at each step of the workflow and during final decision-making, offering insights into the reasoning process of fake news detection for end users. FactAgent is highly adaptable, allowing for straightforward updates to its tools that LLMs can leverage within the workflow, as well as updates to the workflow itself using domain knowledge. This adaptability enables FactAgent's application to news verification across various domains. |
yuexi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1543
-
Li 2024
KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection
Proceedings of the 33rd USENIX Security Symposium 2024;():793-810 USENIX Association 2024 Ref ID: 4292 Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose a Large Language Model (LLM)-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines. © USENIX Security Symposium 2024.All rights reserved. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#473
-
Li 2022
Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation
Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAACL) - Human Language Technologies 2022;():206-218 Seattle, WA Assoc Computational Linguistics-Acl 2022 Ref ID: 3316 Knowledge-grounded dialogue systems are challenging to build due to the lack of training data and heterogeneous knowledge sources. Existing systems perform poorly on unseen topics due to limited topics covered in the training data. In addition, it is challenging to generalize to the domains that require different types of knowledge sources. To address the above challenges, we present PLUG(1), a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks. We first retrieve relevant information from heterogeneous knowledge sources (e.g., wiki, dictionary, or knowledge graph); then the retrieved knowledge is transformed into text and concatenated with dialogue history to feed into the language model for generating responses. PLUG is pre-trained on a large-scale knowledge-grounded dialogue corpus. The empirical evaluation on two benchmarks shows that PLUG generalizes well across different knowledge-grounded dialogue tasks. It achieves comparable performance with state-of-the-art methods in the fully-supervised setting and significantly outperforms other approaches in zero-shot and few-shot settings. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#77
-
Li 2024
Building a knowledge graph to enrich ChatGPT responses in manufacturing service discovery
Sourcing and identification of new manufacturing partners is crucial for manufacturing system integrators to enhance agility and reduce risk through supply chain diversification in the global economy. The advent of advanced large language models has captured significant interest, due to their ability to generate comprehensive and articulate responses across a wide range of knowledge domains. However, the system often falls short in accuracy and completeness when responding to domain-specific inquiries, particularly in areas like manufacturing service discovery. This research explores the potential of leveraging Knowledge Graphs in conjunction with ChatGPT to streamline the process for prospective clients in identifying small manufacturing enterprises. In this study, we propose a method that integrates bottom-up ontology with advanced machine learning models to develop a Manufacturing Service Knowledge Graph from an array of structured and unstructured data sources, including the digital footprints of small-scale manufacturers throughout North America. The Knowledge Graph and the learned graph embedding vectors are leveraged to tackle intricate queries within the digital supply chain network, responding with enhanced reliability and greater interpretability. The approach highlighted is scalable to millions of entities that can be distributed to form a global Manufacturing Service Knowledge Network Graph that can potentially interconnect multiple types of Knowledge Graphs that span industry sectors, geopolitical boundaries, and business domains. The dataset developed for this study, now publicly accessible, encompasses more than 13,000 manufacturers' weblinks, manufacturing services, certifications, and location entity types. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1247
-
Li 2024
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;15020 LNCS():251-265 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-3-031-72344-5_17 · Ref ID: 4139 While Large Language Models (LLMs) demonstrate exceptional performance in a multitude of Natural Language Processing (NLP) tasks, they encounter challenges in practical applications, including issues with hallucinations, inadequate knowledge updating, and limited transparency in the reasoning process. To overcome these limitations, this study innovatively proposes a collaborative training-free reasoning scheme involving tight cooperation between Knowledge Graph (KG) and LLMs. This scheme first involves using LLMs to iteratively explore KG, selectively retrieving a task-relevant knowledge subgraph to support reasoning. The LLMs are then guided to further combine inherent implicit knowledge to reason on the subgraph while explicitly elucidating the reasoning process. Through such a cooperative approach, our scheme achieves more reliable knowledge-based reasoning and facilitates the tracing of the reasoning results. Experimental results show that our scheme significantly progressed across multiple datasets, notably achieving an improvement of over 10% on the QALD10 dataset compared to both the best baseline and the fine-tuned state-of-the-art (SOTA) models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#814
-
Li 2024
Text-enhanced knowledge graph representation learning with local structure
Knowledge graph representation learning entails transforming entities and relationships within a knowledge graph into vectors to enhance downstream tasks. The rise of pre-trained language models has recently promoted text-based approaches for knowledge graph representation learning. However, these methods often need more structural information on knowledge graphs, prompting the challenge of integrating graph structure knowledge into text-based methodologies. To tackle this issue, we introduce a text-enhanced model with local structure (TEGS) that embeds local graph structure details from the knowledge graph into the text encoder. TEGS integrates k-hop neighbor entity information into the text encoder and employs a decoupled attention mechanism to blend relative position encoding and text semantics. This strategy augments learnable content through graph structure information and mitigates the impact of semantic ambiguity via the decoupled attention mechanism. Experimental findings demonstrate TEGS's effectiveness at fusing graph structure information, resulting in state-of-the-art performance across three datasets in link prediction tasks. In terms of Hit@1, when compared to the previous text-based models, our model demonstrated improvements of 2.1% on WN18RR, 2.4% on FB15k-237, and 2.7% on the NELL-One dataset. Our code is made publicly available on |
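The Hit@1 improvements quoted in this abstract use the standard link-prediction hits-at-k metric; as a quick reference (the example ranks below are invented, not the paper's results):

```python
def hits_at_k(ranks, k=1):
    # ranks: 1-based rank assigned to the correct entity for each test triple.
    # Hits@k is the fraction of triples whose correct entity ranks in the top k.
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy ranks of the true entity across five test triples.
ranks = [1, 3, 1, 7, 2]
```

Here two of the five true entities are ranked first, so Hit@1 is 0.4, and four are ranked in the top three, so Hit@3 is 0.8.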
Mike
voted
Davis
voted
Final decision
What was the agreed final decision?
#583
-
Li 2023
Multi-Hop Question Generation with Knowledge Graph-Enhanced Language Model
The task of multi-hop question generation (QG) seeks to generate questions that require a complex reasoning process that spans multiple sentences and answers. Beyond the conventional challenges of what to ask and how to ask, multi-hop QG necessitates sophisticated reasoning from dispersed evidence across multiple sentences. To address these challenges, a knowledge graph-enhanced language model (KGEL) has been developed to imitate human reasoning for multi-hop questions. The initial step in KGEL involves encoding the input sentence with a pre-trained GPT-2 language model to obtain a comprehensive semantic context representation. Next, a knowledge graph is constructed using the entities identified within the context. The critical information in the graph that is related to the answer is then utilized to update the context representations through an answer-aware graph attention network (GAT). Finally, the multi-head attention generation module (MHAG) is performed over the updated latent representations of the context to generate coherent questions. Human evaluations demonstrate that KGEL generates more logical and fluent multi-hop questions compared to GPT-2. Furthermore, KGEL outperforms five prominent baselines in automatic evaluations, with a BLEU-4 score that is 27% higher than that of GPT-2. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#229
-
Li 2024
Ensemble pretrained language models to extract biomedical knowledge from literature
Objectives: The rapid expansion of biomedical literature necessitates automated techniques to discern relationships between biomedical concepts from extensive free text. Such techniques facilitate the development of detailed knowledge bases and highlight research deficiencies. The LitCoin Natural Language Processing (NLP) challenge, organized by the National Center for Advancing Translational Science, aims to evaluate such potential and provides a manually annotated corpus for methodology development and benchmarking. Materials and Methods: For the named entity recognition (NER) task, we utilized ensemble learning to merge predictions from three domain-specific models, namely BioBERT, PubMedBERT, and BioM-ELECTRA, devised a rule-driven detection method for cell line and taxonomy names, and annotated 70 more abstracts as an additional corpus. We further fine-tuned the T0pp model, with 11 billion parameters, to boost the performance on relation extraction and leveraged entities' location information (e.g., title, background) to enhance novelty prediction performance in relation extraction (RE). Results: Our pioneering NLP system designed for this challenge secured first place in Phase I-NER and second place in Phase II-relation extraction and novelty prediction, outpacing over 200 teams. We tested OpenAI ChatGPT 3.5 and ChatGPT 4 in a zero-shot setting using the same test set, revealing that our fine-tuned model considerably surpasses these broad-spectrum large language models. Discussion and Conclusion: Our outcomes depict a robust NLP system excelling in NER and RE across various biomedical entities, emphasizing that task-specific models remain superior to generic large ones. Such insights are valuable for endeavors like knowledge graph development and hypothesis formulation in biomedical research. |
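The ensemble step described here can be pictured as a per-token majority vote over model predictions. The labels and tie-breaking rule below are illustrative assumptions, not the paper's exact scheme:

```python
from collections import Counter

def majority_vote(predictions):
    """Merge per-token NER labels from several models by majority vote.

    predictions: one label sequence per model, all the same length.
    Ties fall back to the first model's label (a simplifying assumption;
    the paper's ensemble rules are more involved).
    """
    merged = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        merged.append(label if count > 1 else labels[0])
    return merged

# Hypothetical outputs from three models on a four-token sentence.
m1 = ["B-GENE", "O", "O", "B-DIS"]
m2 = ["B-GENE", "O", "B-CHEM", "B-DIS"]
m3 = ["O", "O", "B-CHEM", "B-DIS"]
print(majority_vote([m1, m2, m3]))  # ['B-GENE', 'O', 'B-CHEM', 'B-DIS']
```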
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1585
-
Liang 2023
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Proceedings of Machine Learning Research 2023;202():20852-20867 ML Research Press 2023 Ref ID: 5221 Layer-wise distillation is a powerful tool to compress large models (i.e. teacher models) into small ones (i.e., student models). The student distills knowledge from the teacher by mimicking the hidden representations of the teacher at every intermediate layer. However, layer-wise distillation is difficult. Since the student has a smaller model capacity than the teacher, it is often under-fitted. Furthermore, the hidden representations of the teacher contain redundant information that the student does not necessarily need for the target task's learning. To address these challenges, we propose a novel Task-aware layEr-wise Distillation (TED). TED designs task-aware filters to align the hidden representations of the student and the teacher at each layer. The filters select the knowledge that is useful for the target task from the hidden representations. As such, TED reduces the knowledge gap between the two models and helps the student to fit better on the target task. We evaluate TED in two scenarios: continual pre-training and fine-tuning. TED demonstrates significant and consistent improvements over existing distillation methods in both scenarios. Code is available at https://github.com/cliang1453/task-aware-distillation. © 2023 Proceedings of Machine Learning Research. All rights reserved. |
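The layer-wise objective can be pictured as follows. This numpy sketch fixes the filters as given matrices, whereas TED learns them per layer; the names and shapes are assumptions, not the paper's implementation:

```python
import numpy as np

def ted_layer_loss(student_h, teacher_h, filters):
    """Layer-wise distillation in the spirit of TED: at every layer a
    task-aware filter projects both hidden states, and the student
    mimics the filtered teacher under mean-squared error.

    student_h, teacher_h: per-layer hidden states, each of shape (seq, dim).
    filters: per-layer projection matrices; TED learns these, here they
    are fixed placeholders.
    """
    losses = [float(np.mean((s @ w - t @ w) ** 2))
              for s, t, w in zip(student_h, teacher_h, filters)]
    return sum(losses) / len(losses)
```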
Mike
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1374
-
Liang 2023
Hi-ArG: Exploring the Integration of Hierarchical Argumentation Graphs in Language Pretraining
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():14606-14620 Association for Computational Linguistics (ACL) 2023 Ref ID: 4955 The knowledge graph is a structure to store and represent knowledge, and recent studies have discussed its capability to assist language models for various applications. Some variations of knowledge graphs aim to record arguments and their relations for computational argumentation tasks. However, many must simplify semantic types to fit specific schemas, thus losing flexibility and expression ability. In this paper, we propose the Hierarchical Argumentation Graph (Hi-ArG), a new structure to organize arguments. We also introduce two approaches to exploit Hi-ArG, including a text-graph multi-modal model GreaseArG and a new pre-training framework augmented with graph information. Experiments on two argumentation tasks have shown that after further pre-training and fine-tuning, GreaseArG supersedes same-scale language models on these tasks, while incorporating graph information during further pre-training can also improve the performance of vanilla language models. Code for this paper is available at https://github.com/ljcleo/Hi-ArG. ©2023 Association for Computational Linguistics. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3562
-
Liang 2024
KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation
arXiv 2024;(): 2024 Ref ID: 8614 The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications. However, it also has limitations, including the gap between vector similarity and the relevance of knowledge reasoning, as well as insensitivity to knowledge logic, such as numerical values, temporal relations, expert rules, and others, which hinder the effectiveness of professional knowledge services. In this work, we introduce a professional domain knowledge service framework called Knowledge Augmented Generation (KAG). KAG is designed to address the aforementioned challenges with the motivation of making full use of the advantages of knowledge graphs (KGs) and vector retrieval, and to improve generation and reasoning performance by bidirectionally enhancing large language models (LLMs) and KGs through five key aspects: (1) LLM-friendly knowledge representation, (2) mutual-indexing between knowledge graphs and original chunks, (3) logical-form-guided hybrid reasoning engine, (4) knowledge alignment with semantic reasoning, and (5) model capability enhancement for KAG. We compared KAG with existing RAG methods in multi-hop question answering and found that it significantly outperforms state-of-the-art methods, achieving a relative improvement of 19.6% on 2wiki and 33.5% on hotpotQA in terms of F1 score. We have successfully applied KAG to two professional knowledge Q&A tasks of Ant Group, including E-Government Q&A and E-Health Q&A, achieving significant improvement in professionalism compared to RAG methods. |
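Of the five aspects, mutual indexing is the most mechanical to illustrate. A toy sketch using naive substring matching (KAG's actual indexing is far richer; the chunk ids and entity list here are invented):

```python
def build_mutual_index(chunks, entities):
    """Mutual indexing between KG entities and text chunks, in the
    spirit of KAG's second aspect: entity -> chunks mentioning it,
    and chunk -> entities it mentions. Matching is naive substring
    containment for illustration only.
    """
    ent_to_chunks, chunk_to_ents = {}, {}
    for chunk_id, text in chunks.items():
        mentioned = [e for e in entities if e.lower() in text.lower()]
        chunk_to_ents[chunk_id] = mentioned
        for e in mentioned:
            ent_to_chunks.setdefault(e, []).append(chunk_id)
    return ent_to_chunks, chunk_to_ents

# Hypothetical chunks and entities, not from the paper's corpora.
chunks = {"c1": "Aspirin is commonly used to treat headache.",
          "c2": "Headache can have many causes."}
ent_index, chunk_index = build_mutual_index(chunks, ["Aspirin", "Headache"])
```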
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3240
-
Liang 2023
C5: Towards Better Conversation Comprehension and Contextual Continuity for ChatGPT
arXiv 2023;(): 2023 Ref ID: 7799 Large language models (LLMs), such as ChatGPT, have demonstrated outstanding performance in various fields, particularly in natural language understanding and generation tasks. In complex application scenarios, users tend to engage in multi-turn conversations with ChatGPT to keep contextual information and obtain comprehensive responses. However, human forgetting and model contextual forgetting remain prominent issues in multi-turn conversation scenarios, which challenge the users' conversation comprehension and contextual continuity for ChatGPT. To address these challenges, we propose an interactive conversation visualization system called C5, which includes Global View, Topic View, and Context-associated Q&A View. The Global View uses the GitLog diagram metaphor to represent the conversation structure, presenting the trend of conversation evolution and supporting the exploration of locally salient features. The Topic View is designed to display all the question and answer nodes and their relationships within a topic using the structure of a knowledge graph, thereby displaying the relevance and evolution of conversations. The Context-associated Q&A View consists of three linked views, which allow users to explore individual conversations deeply while providing specific contextual information when posing questions. The usefulness and effectiveness of C5 were evaluated through a case study and a user study. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1581
-
Liang 2024
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation
KnowledgeNLP 2024 - 3rd Workshop on Knowledge Augmented Methods for NLP, Proceedings of the Workshop 2024;():44-58 Association for Computational Linguistics (ACL) 2024 Ref ID: 4283 We evaluate the ability of Large Language Models (LLMs) to discern and express their internal knowledge state, a key factor in countering factual hallucination and ensuring reliable application of LLMs. We observe a robust self-awareness of internal knowledge state in LLMs, evidenced by over 85% accuracy in knowledge state probing. However, LLMs often fail to faithfully express their internal knowledge during generation, leading to factual hallucinations. We develop an automated hallucination annotation tool, DreamCatcher, which merges knowledge probing and consistency checking methods to rank factual preference data. Using knowledge preference as the reward, we propose a Reinforcement Learning from Knowledge Feedback (RLKF) training framework, leveraging reinforcement learning to enhance the factuality and honesty of LLMs. Our experiments across multiple models show that RLKF training effectively enhances the ability of models to utilize their internal knowledge state, boosting performance in a variety of knowledge-based and honesty-related tasks. © 2024 Association for Computational Linguistics. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#20
-
Ligabue 2024
Applying a Context-based Method to Build a Knowledge Graph for the Blue Amazon
Knowledge graphs are employed in several tasks, such as question answering and recommendation systems, due to their ability to represent relationships between concepts. Automatically constructing such graphs, however, remains an unresolved challenge within knowledge representation. To tackle this challenge, we propose CtxKG, a method specifically aimed at extracting knowledge graphs in a context of limited resources in which the only input is a set of unstructured text documents. CtxKG is based on OpenIE (a relationship triple extraction method) and BERT (a language model) and contains four stages: the extraction of relationship triples directly from text; the identification of synonyms across triples; the merging of similar entities; and the building of bridges between knowledge graphs of different documents. Our method distinguishes itself from those in the current literature (i) through its use of the parse tree to avoid the overlapping entities produced by base implementations of OpenIE; and (ii) through its bridges, which create a connected network of graphs, overcoming a limitation similar methods have of one isolated graph per document. We compare our method to two others by generating graphs for movie articles from Wikipedia and contrasting them with benchmark graphs built from the OMDb movie database. Our results suggest that our method is able to improve multiple aspects of knowledge graph construction. They also highlight the critical role that triple identification and named-entity recognition have in improving the quality of automatically generated graphs, suggesting future paths for investigation. Finally, we apply CtxKG to build BlabKG, a knowledge graph for the Blue Amazon, and discuss possible improvements. |
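The synonym-merging stage can be pictured with a toy normalizer. CtxKG derives its synonyms with BERT similarity, so the map and triples below are hand-written assumptions:

```python
def merge_entities(triples, synonyms):
    """Collapse entities that refer to the same thing.

    CtxKG identifies synonyms with BERT embeddings; here the synonym
    map is hand-written and normalization is plain lowercasing.
    """
    def canon(entity):
        key = entity.lower()
        return synonyms.get(key, key)
    return {(canon(s), rel, canon(o)) for s, rel, o in triples}

# Two triples extracted from different sentences mention the same film.
triples = [("The Matrix", "directed_by", "Wachowskis"),
           ("Matrix", "released_in", "1999")]
synonyms = {"matrix": "the matrix"}  # hypothetical derived synonym
merged = merge_entities(triples, synonyms)
```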
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1663
-
Lim 2024
Multilingual Question Answering for Malaysia History with Transformer-based Language Model
In natural language processing (NLP), a Question Answering System (QAS) refers to a system or model that is designed to understand and respond to user queries in natural language. Recent advancements in QAS reveal a paradigm shift from traditional machine learning and deep learning approaches towards transformer-based language models. While significant progress has been made, the utilization of these models for historical QAS and the development of QAS for the Malay language remain largely unexplored. This research aims to bridge these gaps, focusing on developing a multilingual QAS for the history of Malaysia by utilizing a transformer-based language model. The system development process encompasses various stages, including data collection, knowledge representation, data loading and pre-processing, document indexing and storing, and the establishment of a querying pipeline with the retriever and reader. A dataset with a collection of 100 articles, including web blogs related to the history of Malaysia, has been constructed, serving as the knowledge base for the proposed QAS. A significant aspect of this research is the use of the translated dataset in English instead of the raw dataset in Malay. This decision was made to leverage the effectiveness of well-established retriever and reader models that were trained on English data. Moreover, an evaluation dataset comprising 100 question-answer pairs has been created to evaluate the performance of the models. A comparative analysis of six different transformer-based language models, namely DeBERTaV3, BERT, ALBERT, ELECTRA, MiniLM, and RoBERTa, has been conducted, where the effectiveness of the models was examined through a series of experiments to determine the best reader model for the proposed QAS. The experimental results reveal that the proposed QAS achieved the best performance when employing RoBERTa as the reader model. 
Finally, the proposed QAS was deployed on Discord and equipped with multilingual support through the incorporation of language detection and translation modules, enabling it to handle queries in both Malay and English. © 2024, Ital Publication. All rights reserved. |
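The querying pipeline pairs a retriever with a reader. A bare-bones word-overlap retriever illustrates the first half; the documents and question are invented, and real systems use BM25 or dense retrieval rather than whitespace tokens:

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by naive word overlap with the question.

    A stand-in for the pipeline's retriever stage; the reader model
    would then extract an answer span from the returned passages.
    """
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

# Invented knowledge-base passages, not from the paper's dataset.
docs = ["Malacca was founded around 1400 by Parameswara.",
        "The reader model extracts an answer span from a passage."]
best = retrieve("Who founded Malacca?", docs)
```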
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#430
-
Lin 2023
Knowledge Graph Completion for Power Grid Main Equipment Using Pretrained Language Models
19th International Conference on Advanced Intelligent Computing Technology and Applications (ICIC) 2023;14089():828-838 Zhengzhou, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2023 DOI: 10.1007/978-981-99-4752-2_68 · Ref ID: 2938 The safe and stable operation of power systems relies on the timely diagnosis of defects in power grid equipment. To achieve this, knowledge graph (KG) can be used to model power grid equipment defect knowledge, and knowledge graph embedding (KGE) can be utilized to embed KG into low dimensional vector spaces for deep learning models. However, pre-trained language model-based KGE methods may not perform as well as structure-based methods due to their limitations in explicitly representing domain-specific knowledge and supplementary information about entities. In this study, a hybrid KGE model called PLMSM was proposed to address this issue. PLMSM combines pre-trained language models with structure-based models to input entities and their supplementary information into a pre-trained language model to obtain their embeddings, which are then combined with the embeddings generated by a structure-based model for entity completion tasks. The model was optimized through efficient negative sampling and addressed the issue of inaccurate predictions caused by long-tail entities in the power grid defects KG. The experimental results showed that PLMSM achieved good performance in Entity completion tasks on the power grid equipment defects KG. This proposed model has potential applications in power grid equipment defect diagnosis and maintenance. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2063
-
Lin 2024
Entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration
The operation and maintenance of railway signal systems generate a large and complex volume of fault-related text data. Aiming at the problems of fuzzy entity boundaries and low accuracy of entity recognition in the field of railway signal equipment faults, this paper provides a method for entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration. First, the model utilizes the RoBERTa-wwm pretrained language model to obtain the word vectors of text sequences. Second, a parallel network consisting of a BiLSTM and a CNN is constructed to obtain the context feature information and the local attention information, respectively. Third, the feature vectors output from the BiLSTM and CNN are combined and fed into the MHA, focusing on extracting key feature information and mining the connections between different features. Finally, the label sequences with constraint relationships are output by the CRF to complete the entity recognition task. The experimental analysis is carried out with fault texts of railway signal equipment from the past ten years, and the experimental results show that the model achieves higher evaluation indices than the traditional models on this dataset, with precision, recall, and F1 values of 93.25%, 92.45%, and 92.85%, respectively. |
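The three reported figures are internally consistent: F1 is the harmonic mean of precision and recall, which can be checked directly:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# The reported precision (93.25%) and recall (92.45%) reproduce the
# reported F1 value of 92.85% (to two decimal places).
print(round(f1(93.25, 92.45), 2))  # 92.85
```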
Mike
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3955
-
Lin 2024
Unleashing the Power of LLMs as Multi-Modal Encoders for Text and Graph-Structured Data
arXiv 2024;(): 2024 Ref ID: 8706 Graph-structured information offers rich contextual information that can enhance language models by providing structured relationships and hierarchies, leading to more expressive embeddings for various applications such as retrieval, question answering, and classification. However, existing methods for integrating graph and text embeddings, often based on Multi-layer Perceptrons (MLPs) or shallow transformers, are limited in their ability to fully exploit the heterogeneous nature of these modalities. To overcome this, we propose Janus, a simple yet effective framework that leverages Large Language Models (LLMs) to jointly encode text and graph data. Specifically, Janus employs an MLP adapter to project graph embeddings into the same space as text embeddings, allowing the LLM to process both modalities jointly. Unlike prior work, we also introduce contrastive learning to align the graph and text spaces more effectively, thereby improving the quality of learned joint embeddings. Empirical results across six datasets spanning three tasks, knowledge graph-contextualized question answering, graph-text pair classification, and retrieval, demonstrate that Janus consistently outperforms existing baselines, achieving significant improvements across multiple datasets, with gains of up to 11.4% in QA tasks. These results highlight Janus's effectiveness in integrating graph and text data. Ablation studies further validate the effectiveness of our method. |
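The contrastive alignment of graph and text spaces is typically an InfoNCE-style objective. The numpy sketch below is a generic version under that assumption, not Janus's exact loss; the temperature and shapes are illustrative:

```python
import numpy as np

def info_nce(graph_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of matched graph-text pairs.

    Row i of graph_emb and text_emb is assumed to be a positive pair;
    every other row in the batch serves as an in-batch negative.
    """
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = g @ t.T / temperature

    def xent_diag(l):
        # Cross-entropy with the diagonal (the matched pair) as the target.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -float(np.mean(np.diag(log_probs)))

    return (xent_diag(logits) + xent_diag(logits.T)) / 2
```

Aligned pairs drive the loss toward zero; mismatched pairs drive it up, which is what pulls the adapter's graph projections toward their paired text embeddings.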
Ishan
voted
brandon
voted
Final decision
What was the agreed final decision?
#1896
-
Lin 2023
Spatial Commonsense Reasoning for Machine Reading Comprehension
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14177 LNAI():347-361 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-46664-9_24 · Ref ID: 5163 This paper studies the problem of spatial commonsense reasoning for the machine reading comprehension task. Spatial commonsense is the human-shared but latent knowledge of object shape, size, distance, and position. Reasoning this abstract knowledge can facilitate machines better perceive their surroundings, which is crucial for general intelligence. However, this valuable topic is challenging and has been less studied. To bridge this research gap, we focus on this topic and propose a new method to realize spatial reasoning. Given a text, we first build a potential reasoning graph based on its parsing tree. To better support spatial reasoning, we retrieve the related commonsense entities and relations from external knowledge sources, including the pre-trained language model (LM) and knowledge graph (KG). LM covers all kinds of factual knowledge and KG has abundant commonsense relations. We then propose a new fusion method called LEGRN (LM Edge-GNN Reasoner Networks) to fuse the text and graph. LEGRN adopts layer-based attention to integrate the LM text encoder and KG graph encoder, which can capture correlations between LM text context and KG graph structure. Considering that spatial relations involve a variety of attributes, we propose an attribute-aware inferential network to deduce the correct answers. To evaluate our approach, we construct a new large-scale dataset named CRCSpatial, consisting of 40k spatial reasoning questions. Experiment results illustrated the effectiveness of our approach. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#296
-
Lin 2023
Fusing topology contexts and logical rules in language models for knowledge graph completion
Knowledge graph completion (KGC) aims to infer missing facts based on the observed ones, which is significant for many downstream applications. Given the success of deep learning and pre-trained language models (LMs), some LM-based methods are proposed for the KGC task. However, most of them focus on modeling the text of fact triples and ignore the deeper semantic information (e.g., topology contexts and logical rules) that is significant for KG modeling. For such a reason, we propose a unified framework FTL-LM to Fuse Topology contexts and Logical rules in Language Models for KGC, which mainly contains a novel path-based method for topology contexts learning and a variational expectation-maximization (EM) algorithm for soft logical rule distilling. The former utilizes a heterogeneous random-walk to generate topology paths and further reasoning paths that can represent topology contexts implicitly and can be modeled by a LM explicitly. The strategies of mask language modeling and contrastive path learning are introduced to model these topology contexts. The latter implicitly fuses logical rules by a variational EM algorithm with two LMs. Specifically, in the E-step, the triple LM is updated under the supervision of observed triples and valid hidden triples verified by the fixed rule LM. And in the M-step, we fix the triple LM and fine-tune the rule LM to update logical rules. Experiments on three common KGC datasets demonstrate the superiority of the proposed FTL-LM, e.g., it achieves 2.1% and 3.1% Hits@10 improvement over the state-of-the-art LM-based model LP-BERT in the WN18RR and FB15k-237, respectively. |
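The topology-path idea can be pictured as a walk that alternates entities and relations. The tiny KG below is invented, and the walk chooses uniformly rather than following FTL-LM's heterogeneous strategy:

```python
import random

def random_walk(graph, start, steps, seed=0):
    """Generate one topology path of alternating entities and relations.

    graph maps an entity to its outgoing (relation, neighbor) edges.
    The resulting path can be fed to a language model as a sequence.
    """
    rng = random.Random(seed)
    path, node = [start], start
    for _ in range(steps):
        edges = graph.get(node, [])
        if not edges:
            break
        relation, node = rng.choice(edges)
        path += [relation, node]
    return path

# A two-edge toy KG (not from the paper's datasets).
kg = {"Paris": [("capital_of", "France")],
      "France": [("located_in", "Europe")]}
path = random_walk(kg, "Paris", 2)
print(path)  # ['Paris', 'capital_of', 'France', 'located_in', 'Europe']
```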
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1538
-
Lin 2024
A Knowledge-Injected Curriculum Pretraining Framework for Question Answering
WWW 2024 - Proceedings of the ACM Web Conference 2024;():1986-1997 Association for Computing Machinery, Inc 2024 DOI: 10.1145/3589334.3645406 · Ref ID: 4089 Knowledge-based question answering (KBQA) is a key task in natural language processing research, and also an approach to access the web data and knowledge, which requires exploiting knowledge graphs (KGs) for reasoning. In the literature, one promising solution for KBQA is to incorporate the pretrained language model (LM) with KGs by generating KG-centered pretraining corpus, which has shown its superiority. However, these methods often depend on specific techniques and resources to work, which may not always be available and restricts their application. Moreover, existing methods focus more on improving language understanding with KGs, while neglecting the more important human-like complex reasoning. To this end, in this paper, we propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for KBQA tasks, which is composed of knowledge injection (KI), knowledge adaptation (KA) and curriculum reasoning (CR). Specifically, the KI module first injects knowledge into the LM by generating KG-centered pretraining corpus, and generalizes the process into three key steps that could work with different implementations for flexible application. Next, the KA module learns knowledge from the generated corpus with the LM equipped with an adapter as well as keeps its original natural language understanding ability to reduce the negative impacts of the difference between the generated and natural corpus. Last, to enable the LM with complex reasoning, the CR module follows human reasoning patterns to construct three corpora with increasing difficulties of reasoning, and further trains the LM from easy to hard in a curriculum manner to promote model learning. 
We provide an implementation of the general framework, and evaluate the proposed KICP on four real-world datasets. The results demonstrate that our framework can achieve higher performance and has good generalization ability to other QA tasks. © 2024 ACM. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#565
-
Ling 2022
MetaGNN-Based Medical Records Unstructured Specialized Vocabulary Few-Shot Representation Learning
With the continuous breakthroughs in artificial intelligence technology, it has become easier to extract general-purpose knowledge using machine learning, but it is a challenging task to extract and learn small samples of knowledge in medical expertise. On the one hand, it is difficult to represent medical expertise entities; on the other hand, the training samples of such expertise are small, and deep learning methods often require a large number of samples to complete the learning task. To this end, we propose a graph network learning method for specialized vocabulary representation. Specifically, a contextual knowledge representation model based on graph meta-learning is proposed, which combines text, phrase, vocabulary, and other information to solve the problem that sparse medical electronic medical record entity data cannot otherwise be extracted and learned. In this method, a text-independent lexical representation learning method, a context-aware graph neural network, and a combined LSTM language model are used to model information from different perspectives as a way to learn semantic representations of professional discourse entities. The experimental results show that the method's accuracy outperforms that of other similar methods, demonstrating its effectiveness. |
Ishan
voted
brandon
voted
Final decision
What was the agreed final decision?
#2985
-
Lipaczewski 2013
Teaching and Training Formal Methods for Safety Critical Systems
2013 39th Euromicro Conference on Software Engineering and Advanced Applications 2013;():408-413 2013 DOI: 10.1109/SEAA.2013.54 · Ref ID: 6574 Embedded systems have become a major part of many domains. This also involves systems which might cause heavy damage and injuries when they fail. However, because of the rising number of software components used within this embedded hardware, safety-related problems are hard to discover, and it is even harder to prove that there are none. One approach to guarantee the correctness of a system is model-based safety analysis, which relies on an abstract representation of the system that can then be analyzed using model checkers. The results of these analyses are in general much more precise and often reveal surprising failure combinations that no one had thought of before. Nevertheless, model-based safety analysis is not widely used, mainly because it is not well known and hard to apply under current safety standards, which rely on manual approaches. Another reason might be that most approaches are scientific prototypes that are hard to use. In this paper we present some ideas and first steps towards an easy-to-learn and easy-to-use model-based safety approach. Additionally, we present different user interfaces that are supposed to support the user in their learning. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1252
-
Lippolis 2023
Enhancing Entity Alignment Between Wikidata and ArtGraph Using LLMs
CEUR Workshop Proceedings 2023;3540(): CEUR-WS 2023 Ref ID: 5051 Knowledge graphs (KGs) are used in a wide variety of applications, including within the cultural heritage domain. An important prerequisite of such applications is the quality and completeness of the data. Using a single KG might not be enough to fulfill this requirement. The absence of connections between KGs complicates taking advantage of the complementary data they can provide. This paper focuses on the Wikidata and ArtGraph KGs, which exhibit gaps in content that can be filled by enriching one with data from the other. Entity alignment can help to combine data from KGs by connecting entities that refer to the same real-world entities. However, entity alignment in art-domain knowledge graphs remains under-explored. In the pursuit of entity alignment between ArtGraph and Wikidata, a hybrid approach is proposed. The first part, which we call WES (Wikidata Entity Search), utilizes traditional Wikidata SPARQL queries and is followed by a supplementary sequence-to-sequence large language model (LLM) pipeline that we denote as pArtLink. The combined approach successfully aligned artworks and artists, with WES identifying entities for 14,982 artworks and 2,029 artists, and pArtLink further aligning 76 additional artists, thus enhancing the alignment process beyond WES's capabilities. © 2023 Copyright for this paper by its authors. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#322
-
Lissandrini 2020
Graph-Query Suggestions for Knowledge Graph Exploration
29th Web Conference (WWW) 2020;():2549-2555 Taipei, TAIWAN Assoc Computing Machinery 2020 DOI: 10.1145/3366423.3380005 · Ref ID: 2988 We consider the task of exploratory search through graph queries on knowledge graphs. We propose to assist the user by expanding the query with intuitive suggestions to provide a more informative (full) query that can retrieve more detailed and relevant answers. To achieve this result, we propose a model that can bridge graph search paradigms with well-established techniques for information-retrieval. Our approach does not require any additional knowledge from the user and builds on principled language modelling approaches. We empirically show the effectiveness and efficiency of our approach on a large knowledge graph and how our suggestions are able to help build more complete and informative queries. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#71
-
Liu 2024
Bootstrapping Large Language Models for Radiology Report Generation
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():18635-18643 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3502 Radiology report generation (RRG) aims to automatically generate a free-text description from a specific clinical radiograph, e.g., chest X-ray images. Existing approaches tend to perform RRG with specific models trained from scratch on public yet limited data, which often leads to inferior performance owing to inefficient capabilities in both aligning visual and textual features and generating informative reports accordingly. Currently, large language models (LLMs) offer a promising solution to text generation with their power in learning from big data, especially for cross-modal scenarios such as RRG. However, most existing LLMs are pre-trained on general data and suffer from the same problem as conventional approaches, caused by the knowledge gap between the general and medical domains when they are applied to RRG. Therefore, in this paper, we propose an approach to bootstrapping LLMs for RRG with an in-domain instance induction and a coarse-to-fine decoding process. Specifically, the in-domain instance induction process learns to align the LLM to radiology reports from general texts through contrastive learning. The coarse-to-fine decoding performs a text elevating process for those reports from the ranker, further enhanced with visual features and refinement prompts. Experimental results on two prevailing RRG datasets, namely, IU X-Ray and MIMIC-CXR, demonstrate the superiority of our approach to previous state-of-the-art solutions. Further analyses illustrate that, for the LLM, the induction process enables it to better align with the medical domain and the coarse-to-fine generation allows it to conduct more precise text generation. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
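The induction step in the record above aligns the LLM to radiology reports through contrastive learning. As a rough, self-contained sketch of that kind of objective (toy embeddings and temperature, not the authors' implementation), an InfoNCE-style loss pulls an anchor representation toward its positive and away from negatives:

```python
import math

def cosine(u, v):
    # Cosine similarity between two plain-list vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # InfoNCE: -log( exp(sim(a,p)/t) / sum over positive and negatives ).
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    exps = [math.exp(s / temperature) for s in sims]
    return -math.log(exps[0] / sum(exps))

# Toy embeddings: the positive pair is nearly aligned, the negatives are not.
anchor = [1.0, 0.1]
positive = [0.9, 0.2]
negatives = [[-1.0, 0.0], [0.0, 1.0]]
loss = info_nce(anchor, positive, negatives)
```

The loss is near zero when the anchor and positive are close relative to the negatives, and grows large when they are not.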
#546
-
Liu 2024
MAKG: A maritime accident knowledge graph for intelligent accident analysis and management
With the increasing frequency of human activities at sea, maritime accidents are occurring more often. Analyzing and mining maritime accident cases can help uncover the causal mechanisms behind these incidents, thereby enhancing maritime safety. As an emerging technology for knowledge management and mining, knowledge graphs offer significant support for the storage, reasoning, and decision-making processes related to maritime accidents. In this study, we established a knowledge graph construction and application framework for maritime accidents to facilitate the extraction and management of maritime knowledge from unstructured texts. First, 581 accident reports released by the China Maritime Safety Administration over the past decade (2014-2023) were used as the data basis for analysis and construction of the maritime accident ontology structure using the seven-step method, which comprises 8 entity types, 8 relationship types, and 18 attribute entity types. Second, we proposed MBERT-BiLSTM-CRF-SF, a named entity recognition model based on domain pretraining and self-training, to reduce graph construction costs. This model achieved state-of-the-art performance in the maritime domain, with an F1 score of 0.910 +/- 0.006, about 5% higher than the mainstream model. In addition, we proposed an entity alignment method based on font and semantics to refine knowledge further. On the basis of the proposed method, we constructed a large, high-quality maritime accident knowledge graph (MAKG) system that contains 16,099 entities and 20,809 relationship instances. Finally, we reduced the complexity of applying knowledge graphs by integrating the CRISPE prompt learning framework of the large language model, and experiments on graph traversal, pattern recognition, and aggregation analysis were conducted to assess the quality of MAKG.
Results demonstrate that MAKG can effectively enhance the efficiency of querying and reasoning about maritime accident information, thus providing significant support for the prevention and management of maritime accidents. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
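The F1 of 0.910 reported for the NER model above is presumably computed at the entity level. A minimal sketch of standard entity-level precision/recall/F1 over span sets (illustrative spans and types, not the paper's evaluation script):

```python
def entity_f1(gold, pred):
    # Entity-level precision/recall/F1 over (start, end, type) spans.
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)  # exact-match true positives
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical annotated spans from a maritime accident report.
gold = {(0, 2, "Ship"), (5, 7, "Port"), (9, 11, "Accident")}
pred = {(0, 2, "Ship"), (5, 7, "Port"), (12, 14, "Date")}
p, r, f = entity_f1(gold, pred)
```

Here two of three predictions match, so precision, recall, and F1 all equal 2/3.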
#3409
-
Liu 2023
Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
arXiv 2023;(): 2023 Ref ID: 7949 Can large language models (LLMs) express their uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses? This work aims to systematically investigate LLMs' behaviors in such situations, emphasizing the trade-off between honesty and helpfulness. To tackle the challenge of precisely determining LLMs' knowledge gaps, we diagnostically create unanswerable questions containing non-existent concepts or false premises, ensuring that they are outside the LLMs' vast training data. By compiling a benchmark, UnknownBench, which consists of both unanswerable and answerable questions, we quantitatively evaluate the LLMs' performance in maintaining honesty while being helpful. Using a model-agnostic unified confidence elicitation approach, we observe that most LLMs fail to consistently refuse or express uncertainty towards questions outside their parametric knowledge, although instruction fine-tuning and alignment techniques can provide marginal enhancements. Moreover, LLMs' uncertainty expression does not always stay consistent with the perceived confidence of their textual outputs. |
brandon
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
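The record above evaluates a unified confidence elicitation approach. One simple stand-in for such a mechanism (an assumption here, not the paper's method) is to abstain when the normalized entropy of the model's answer distribution is high:

```python
import math

def normalized_entropy(probs):
    # Shannon entropy of an answer distribution, scaled to [0, 1].
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def should_abstain(probs, threshold=0.75):
    # Refuse to answer when the distribution is close to uniform.
    return normalized_entropy(probs) >= threshold

confident = [0.9, 0.05, 0.03, 0.02]   # one option dominates
uncertain = [0.25, 0.25, 0.25, 0.25]  # no parametric knowledge to prefer any option
```

A uniform distribution has normalized entropy exactly 1 and triggers abstention; the peaked one does not. The threshold is a tunable assumption.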
#329
-
Liu 2022
Heterogeneous graph prompt for Community Question Answering
Compared with general question answering, Community Question Answering (CQA) has been widely used in various scenarios such as E-commerce and is well received. In order to answer the user's question precisely, many CQA models resort to external knowledge sources such as Wikipedia. The main challenge of the task is knowledge extraction and utilization. Different from the traditional method of designing task-specific knowledge modules, we propose a graph prompt-based learning method that directly steers the pretrained language model to solve CQA tasks. Multiple information sources are organized as graph prompts to guide the generation of the model, naturally leveraging the knowledge learned in the pretraining step. Built on a large-scale pretrained bidirectional and autoregressive transformer language model, comparable performance is achieved with less than 10% of the full fine-tuning time by optimizing only the graph prompt parameters. Experiments on two standard CQA datasets show that, compared with traditional sequentially initialized prompts, the graph prompt achieves 20.47% and 14.89% increments in BLEU and ROUGE-L scores under quick fine-tuning and outperforms them in few-shot learning. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
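ROUGE-L, one of the metrics reported above, is an F-measure over the longest common subsequence (LCS) of reference and candidate tokens. A minimal sketch:

```python
def lcs_length(a, b):
    # Length of the longest common subsequence of two token lists (dynamic programming).
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference, candidate):
    # ROUGE-L F1 from LCS-based recall and precision.
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    r, p = lcs / len(ref), lcs / len(cand)
    return 2 * p * r / (p + r)

score = rouge_l("the model answers the question", "the model restates the question")
```

The LCS here is "the model the question" (4 tokens out of 5 on each side), giving recall, precision, and F1 of 0.8.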
#1485
-
Liu 2024
KNOWFORMER: Revisiting Transformers for Knowledge Graph Reasoning
Proceedings of Machine Learning Research 2024;235():31669-31690 ML Research Press 2024 Ref ID: 4368 Knowledge graph reasoning plays a vital role in various applications and has garnered considerable attention. Recently, path-based methods have achieved impressive performance. However, they may face limitations stemming from constraints in message-passing neural networks, such as missing paths and information over-squashing. In this paper, we revisit the application of transformers for knowledge graph reasoning to address the constraints faced by path-based methods and propose a novel method KNOWFORMER. KNOWFORMER utilizes a transformer architecture to perform reasoning on knowledge graphs from the message-passing perspective, rather than reasoning by textual information like previous pretrained language model based methods. Specifically, we define the attention computation based on the query prototype of knowledge graph reasoning, facilitating convenient construction and efficient optimization. To incorporate structural information into the self-attention mechanism, we introduce structure-aware modules to calculate query, key, and value respectively. Additionally, we present an efficient attention computation method for better scalability. Experimental results demonstrate the superior performance of KNOWFORMER compared to prominent baseline methods on both transductive and inductive benchmarks. Copyright 2024 by the author(s) |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
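KNOWFORMER's structure-aware attention builds on ordinary scaled dot-product attention. The base computation, sketched for a single query over toy vectors (without the paper's structure-aware query/key/value modules, which are its actual contribution):

```python
import math

def attention(query, keys, values):
    # Single-query scaled dot-product attention over plain-list vectors.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, out

# The query matches the first key, so most weight goes to the first value.
weights, out = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0], [0.0]])
```

The softmax weights sum to 1 and concentrate on the key most similar to the query.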
#876
-
Liu 2022
VoCSK: Verb-oriented commonsense knowledge mining with taxonomy-guided induction
Commonsense knowledge acquisition is one of the fundamental issues in realizing human-level AI. However, commonsense knowledge is difficult to obtain because it is a human consensus and rarely explicitly appears in texts or other data. In this paper, we focus on the automatic acquisition of a typical kind of implicit verb-oriented commonsense knowledge (e.g., "person eats food"), which is the concept-level knowledge of verb phrases. For this purpose, we propose a taxonomy-guided induction method to mine verb-oriented commonsense knowledge from verb phrases with the help of a probabilistic taxonomy. First, we design an entropy-based triplet filter to cope with noisy verb phrases. Then, we propose a joint model based on the minimum description length principle and a neural language model to generate verb-oriented commonsense knowledge. Besides, we introduce two strategies to accelerate the computation, including the simulated annealing-based approximate solution and the verb phrase clustering method. Finally, we conduct extensive experiments to prove that our solution is more effective than competitors in mining verb-oriented commonsense knowledge. We construct a commonsense knowledge base called VoCSK, containing 259 verbs and 18,406 verb-oriented commonsense knowledge facts. To verify the usefulness of VoCSK, we utilize the knowledge in this KB to improve the model performance on two downstream applications. (C) 2022 Elsevier B.V. All rights reserved. |
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#408
-
Liu 2023
KEPT: Knowledge Enhanced Prompt Tuning for event causality identification
Event causality identification (ECI) aims to identify causal relations of event mention pairs in text. Despite achieving certain accomplishments, existing methods are still not effective due to the following two issues: (1) the lack of causal reasoning ability, imposing restrictions on recognizing implicit causal relations; (2) the significant gap between fine-tuning and pre-training, which hinders the utilization of pre-trained language models (PLMs). In this paper, we propose a novel Knowledge Enhanced Prompt Tuning (KEPT) framework for ECI to address the issues mentioned above. Specifically, this method leverages prompt tuning to incorporate two kinds of knowledge obtained from external knowledge bases (KBs), including background information and relational information, for causal reasoning. To introduce external knowledge into our model, we first convert it to textual descriptions, then design an interactive attention mechanism and a selective attention mechanism to fuse background information and relational information, respectively. In addition, to further capture implicit relations between events, we adopt the objective from knowledge representation learning to jointly optimize the representations of causal relations and events. Experiment results on two widely-used benchmarks demonstrate that the proposed method outperforms the state-of-the-art models.(c) 2022 Elsevier B.V. All rights reserved. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1123
-
Liu 2023
Constructing Knowledge Graph from Cyber Threat Intelligence Using Large Language Model
Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023 2023;():516-521 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BigData59044.2023.10386611 · Ref ID: 4987 Cyber Threat Intelligence (CTI) reports are valuable resources in various applications but manually extracting information from them is time-consuming. Existing approaches for automating extraction require specialized models trained on a substantial corpus. In this paper, we present an efficient methodology for constructing knowledge graphs from CTI by leveraging the Large Language Model (LLM), using ChatGPT for instance. Our approach automatically extracts attack-related entities and their relationships, organizing them within a CTI knowledge graph. We evaluate our approach on 13 CTIs, demonstrating better performance compared to AttacKG and REBEL while requiring less manual intervention and computational resources. This proves the feasibility and suitability of our method in low-resource scenarios, specifically within the domain of cyber threat intelligence. © 2023 IEEE. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
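Once attack-related entities and relationships are extracted by the LLM, organizing them into a queryable graph can be as simple as an adjacency index keyed by subject. A sketch with invented CTI triples (not from the paper's evaluation set):

```python
from collections import defaultdict

def build_graph(triples):
    # Index (subject, relation, object) triples for simple outgoing-edge lookups.
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph

# Hypothetical extractions from a threat-intelligence report.
triples = [
    ("APT29", "uses", "spear-phishing"),
    ("APT29", "targets", "government networks"),
    ("spear-phishing", "delivers", "malicious attachment"),
]
graph = build_graph(triples)
```

Querying `graph["APT29"]` then returns all techniques and targets attributed to that actor in the extracted triples.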
#3806
-
Liu 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
arXiv 2022;(): 2022 Ref ID: 7584 Knowledge underpins reasoning. Recent research demonstrates that when relevant knowledge is provided as additional context to commonsense question answering (QA), it can substantially enhance the performance even on top of state-of-the-art. The fundamental challenge is where and how to find such knowledge that is high quality and on point with respect to the question; knowledge retrieved from knowledge bases is incomplete and knowledge generated from language models is inconsistent. We present Rainier, or Reinforced Knowledge Introspector, that learns to generate contextually relevant knowledge in response to given questions. Our approach starts by imitating knowledge generated by GPT-3, then learns to generate its own knowledge via reinforcement learning where rewards are shaped based on the increased performance on the resulting question answering. Rainier demonstrates substantial and consistent performance gains when tested over 9 different commonsense benchmarks, including 5 datasets that are seen during model training, as well as 4 datasets that are kept unseen. Our work is the first to report that knowledge generated by models that are orders of magnitude smaller than GPT-3, even without direct supervision on the knowledge itself, can exceed the quality of commonsense knowledge elicited from GPT-3. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3351
-
Liu 2024
DRAK: Unlocking Molecular Insights with Domain-Specific Retrieval-Augmented Knowledge in LLMs
arXiv 2024;(): 2024 Ref ID: 8427 Large Language Models (LLMs) encounter challenges with the unique syntax of specific domains, such as biomolecules. Existing fine-tuning or modality alignment techniques struggle to bridge the domain knowledge gap and understand complex molecular data, limiting LLMs' progress in specialized fields. To overcome these limitations, we propose an expandable and adaptable non-parametric knowledge injection framework named Domain-specific Retrieval-Augmented Knowledge (DRAK), aimed at enhancing reasoning capabilities in specific domains. Utilizing knowledge-aware prompts and gold label-induced reasoning, DRAK has developed profound expertise in the molecular domain and the capability to handle a broad spectrum of analysis tasks. We evaluated two distinct forms of DRAK variants, proving that DRAK exceeds previous benchmarks on six molecular tasks within the Mol-Instructions dataset. Extensive experiments have underscored DRAK's formidable performance and its potential to unlock molecular insights, offering a unified paradigm for LLMs to tackle knowledge-intensive tasks in specific domains. Our code will be available soon. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#667
-
Liu 2024
PrimeNet: A Framework for Commonsense Knowledge Representation and Reasoning Based on Conceptual Primitives
Commonsense knowledge acquisition and representation is a core topic in artificial intelligence (AI), which is crucial for building more sophisticated and human-like AI systems. However, existing commonsense knowledge bases organize facts in an isolated manner like bag of facts, lacking the cognitive-level connections that humans commonly possess. People have the ability to efficiently organize vast amounts of knowledge by linking or generalizing concepts using a limited set of conceptual primitives that serve as the fundamental building blocks of reasoning. These conceptual primitives are basic, foundational elements of thought that humans use to make sense of the world. By combining and recombining these primitives, people can construct complex ideas, solve problems, and understand new concepts. To emulate this cognitive mechanism, we design a new commonsense knowledge base, termed PrimeNet, organized in a three-layer structure: a small core of conceptual primitives (e.g., FOOD), a bigger set of concepts that connect to such primitives (e.g., fruit), and an even larger layer of entities connecting to the concepts (e.g., banana). First, we collect commonsense knowledge and employ a gradual expansion strategy for knowledge integration. After refinement, PrimeNet contains 6 million edges between 2 million nodes, with 34 different types of relations. Then, we design a new conceptualization method by leveraging a probabilistic taxonomy, to build the concept layer of PrimeNet. Finally, we conduct primitive detection to build the primitive layer, where a lexical substitution task is used to identify related concepts, and large language models are employed to generate a rational primitive to label each concept cluster as well as verify the primitive detection process. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
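Using the abstract's own example chain (banana -> fruit -> FOOD), the three-layer structure can be modelled as two lookup maps from entities to concepts and concepts to primitives. A toy sketch, not PrimeNet's actual storage format:

```python
# Three-layer toy structure: entity -> concept -> primitive.
# "banana"/"fruit"/"FOOD" come from the abstract; the other rows are invented.
entity_to_concept = {"banana": "fruit", "apple": "fruit", "bus": "vehicle"}
concept_to_primitive = {"fruit": "FOOD", "vehicle": "TRANSPORT"}

def primitive_of(entity):
    # Generalize an entity up to its conceptual primitive, if known.
    concept = entity_to_concept.get(entity)
    return concept_to_primitive.get(concept) if concept else None
```

Unknown entities fall through to `None`, so reasoning code can detect where the generalization chain breaks.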
#709
-
Liu 2022
Relational Memory-Augmented Language Models
We present a memory-augmented approach to condition an autoregressive language model on a knowledge graph. We represent the graph as a collection of relation triples and retrieve relevant relations for a given context to improve text generation. Experiments on WikiText-103, WMT19, and enwik8 English datasets demonstrate that our approach produces a better language model in terms of perplexity and bits per character. We also show that relational memory improves coherence, is complementary to token-based memory, and enables causal interventions. Our model provides a simple yet effective way to combine an autoregressive language model and a knowledge graph for more coherent and logical generation. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
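Retrieving relevant relation triples for a given context, as described above, can be approximated by ranking triples on entity-token overlap with the context; a simplified sketch (the paper's retriever is more sophisticated than this):

```python
def retrieve_triples(context_tokens, triples, k=2):
    # Rank triples by entity-token overlap with the context and keep the top-k hits.
    context = set(t.lower() for t in context_tokens)

    def overlap(triple):
        subj, rel, obj = triple
        words = set(subj.lower().split()) | set(obj.lower().split())
        return len(words & context)

    ranked = sorted(triples, key=overlap, reverse=True)
    return [t for t in ranked[:k] if overlap(t) > 0]

# Hypothetical relational memory.
triples = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("France", "borders", "Spain"),
]
selected = retrieve_triples("The history of Paris , France".split(), triples)
```

The selected triples would then be prepended to the language-model context as relational memory.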
#1968
-
Liu 2024
Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach
Knowledge-enhanced text generation aims to enhance the quality of generated text by utilizing internal or external knowledge sources. While language models have demonstrated impressive capabilities in generating coherent and fluent text, the lack of interpretability presents a substantial obstacle. The limited interpretability of generated text significantly impacts its practical usability, particularly in knowledge-enhanced text generation tasks that necessitate reliability and explainability. Existing methods often employ domain-specific knowledge retrievers that are tailored to specific data characteristics, limiting their generalizability to diverse data types and tasks. To overcome this limitation, we directly leverage the two-tier architecture of structured knowledge, consisting of high-level entities and low-level knowledge triples, to design our task-agnostic structured knowledge hunter. Specifically, we employ a local-global interaction scheme for structured knowledge representation learning and a hierarchical transformer-based pointer network as the backbone for selecting relevant knowledge triples and entities. By combining the strong generative ability of language models with the high faithfulness of the knowledge hunter, our model achieves high interpretability, enabling users to comprehend the model's output generation process. Furthermore, we empirically demonstrate the effectiveness of our model in both internal knowledge-enhanced table-to-text generation on the RotoWire-FG dataset and external knowledge-enhanced dialogue response generation on the KdConv dataset. Our task-agnostic model outperforms state-of-the-art methods and corresponding language models, setting new standards on the benchmark. IEEE |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#890
-
Liu 2023
Zero-Shot Text Classification with Semantically Extended Textual Entailment
International Joint Conference on Neural Networks (IJCNN) 2023;(): Broadbeach, AUSTRALIA Ieee 2023 DOI: 10.1109/ijcnn54540.2023.10191094 · Ref ID: 3373 Zero-shot text classification (0SHOT-TC) aims to detect classes that the model has never seen in the training set, and has attracted much attention in the research community of Natural Language Processing (NLP). The emergence of pretrained language models has fostered the progress of 0SHOT-TC, which turns the task into a textual entailment problem of binary classification. It learns an entailment relatedness (yes/no) between the given sentence (premise) and each category (hypothesis) separately. However, the hypothesis generation paradigms need to be further studied, since the label itself or the label descriptions have limited ability to fully express the category space. Conversely, humans can easily extend a set of words describing the categories to be classified. In this paper, we propose a novel zero-shot text classification method called Semantically Extended Textual Entailment (SETE), which imitates the human ability of knowledge extension. In the proposed method, three semantic extension methods are used to enrich the categories through a combination of static knowledge (e.g. expert knowledge, knowledge graph) and dynamic knowledge (e.g. language models), and the textual entailment model is finally used for 0SHOT-TC. The experimental results on the benchmarks show that our approach significantly outperforms the current methods in both generalized and non-generalized 0SHOT-TC. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
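The entailment formulation above scores each extended label description against the input and picks the best-entailed category. A toy sketch in which a keyword-overlap function stands in for the trained textual-entailment model (the label keyword sets are invented):

```python
def entailment_score(premise, hypothesis_keywords):
    # Stand-in scorer: fraction of extended label keywords present in the text.
    # A real SETE-style system would use a trained entailment model here instead.
    words = set(premise.lower().split())
    return sum(1 for w in hypothesis_keywords if w in words) / len(hypothesis_keywords)

# Semantically extended label descriptions (extension of the bare label names).
label_keywords = {
    "sports": ["game", "team", "score", "player"],
    "finance": ["market", "stock", "price", "investor"],
}

def classify(text):
    # Pick the label whose extended description is best "entailed" by the text.
    return max(label_keywords, key=lambda lab: entailment_score(text, label_keywords[lab]))

pred = classify("the team won the game with a late score")
```

Extending each label with related words is what lets unseen categories be matched at all, which is the core idea the abstract attributes to semantic extension.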
#3403
-
Liu 2024
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8209 The advent of Large Language Models (LLMs) has significantly transformed the AI landscape, enhancing machine learning and AI capabilities. Factuality issue is a critical concern for LLMs, as they may generate factually incorrect responses. In this paper, we propose GraphEval to evaluate an LLM's performance using a substantially large test dataset. Specifically, the test dataset is retrieved from a large knowledge graph with more than 10 million facts without expensive human efforts. Unlike conventional methods that evaluate LLMs based on generated responses, GraphEval streamlines the evaluation process by creating a judge model to estimate the correctness of the answers given by the LLM. Our experiments demonstrate that the judge model's factuality assessment aligns closely with the correctness of the LLM's generated outputs, while also substantially reducing evaluation costs. Besides, our findings offer valuable insights into LLM performance across different metrics and highlight the potential for future improvements in ensuring the factual integrity of LLM outputs. The code is publicly available at https://github.com/xz-liu/GraphEval. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
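A common way to derive false statements from a knowledge graph for factuality tests like the one above is to corrupt the object of a true triple. This is an illustrative assumption; the paper retrieves its test set from the KG and its exact construction may differ:

```python
import random

def corrupt_object(fact, candidate_objects, rng):
    # Produce a false statement by swapping in a wrong (but plausible) object.
    subj, rel, obj = fact
    wrong = rng.choice([o for o in candidate_objects if o != obj])
    return (subj, rel, wrong)

# Toy facts; a real pipeline would sample from millions of KG triples.
facts = [("Paris", "capital_of", "France"), ("Tokyo", "capital_of", "Japan")]
objects = [o for _, _, o in facts]
rng = random.Random(0)
false_fact = corrupt_object(facts[0], objects, rng)
```

True and corrupted triples can then be verbalized into statements, and a judge model's labels compared against the known ground truth.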
#418
-
Liu 2023
Knowledge Base Question Answering via Semantic Analysis
Knowledge Question Answering is one of the important research directions in the field of robot intelligence. It analyzes users' questions and generates answers based on background knowledge, and is one of the important application methods of knowledge graph technology. Compared with the traditional expert system of question and answer, it has the advantage of a large-scale background knowledge base and the traceability and interpretability of the question-answering process. Compared with the current ChatGPT (Chat Generative Pre-trained Transformer) technology, it has advantages in proprietary, specialized fields. Aiming at the low accuracy of existing knowledge question-answering methods, this paper studies the method of semantic analysis for knowledge question-answering under the support of a knowledge database, proposes a knowledge question-answering method based on the superposition of multiple neural network models, and conducts experimental verification on the publicly available NLPCC2016KBQA (Knowledge Q&A Tasks in the 2016 Natural Language Processing and Chinese Computing Conference) dataset. The experimental results show that the F1 value of this method is higher than that of the baseline model. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#542
-
Liu 2023
Local and Global: Temporal Question Answering via Information Fusion
32nd International Joint Conference on Artificial Intelligence (IJCAI) 2023;():5141-5149 Macao, PEOPLES R CHINA Ijcai-Int Joint Conf Artif Intell 2023 Ref ID: 3489 Many models that leverage knowledge graphs (KGs) have recently demonstrated remarkable success in question answering (QA) tasks. In the real world, many facts contained in KGs are time-constrained thus temporal KGQA has received increasing attention. Despite the fruitful efforts of previous models in temporal KGQA, they still have several limitations. (I) They neither emphasize the graph structural information between entities in KGs nor explicitly utilize a multi-hop relation path through graph neural networks to enhance answer prediction. (II) They adopt pre-trained language models (LMs) to obtain question representations, focusing merely on the global information related to the question while not highlighting the local information of the entities in KGs. To address these limitations, we introduce a novel model that simultaneously explores both Local information and Global information for the task of temporal KGQA (LGQA). Specifically, we first introduce an auxiliary task in the temporal KG embedding procedure to make timestamp embeddings time-order aware. Then, we design information fusion layers that effectively incorporate local and global information to deepen question understanding. We conduct extensive experiments on two benchmarks, and LGQA significantly outperforms previous state-of-the-art models, especially in difficult questions. Moreover, LGQA can generate interpretable and trustworthy predictions. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#123
-
Liu 2020
Commonsense Evidence Generation and Injection in Reading Comprehension
21st Annual Meeting of the Special-Interest-Group-on-Discourse-and-Dialogue (SIGDIAL) 2020;():61-73 Electr Network Assoc Computational Linguistics 2020 Ref ID: 3433 Humans tackle reading comprehension based not only on the given context itself but often also on the commonsense beyond it. To empower the machine with commonsense reasoning, in this paper, we propose a Commonsense Evidence Generation and Injection framework in reading comprehension, named CEGI. The framework injects two kinds of auxiliary commonsense evidence into comprehensive reading to equip the machine with the ability of rational thinking. Specifically, we build two evidence generators: one aims to generate textual evidence via a language model; the other aims to extract factual evidence (automatically aligned text-triples) from a commonsense knowledge graph after graph completion. These pieces of evidence incorporate contextual commonsense and serve as additional inputs to the reasoning model. Thereafter, we propose a deep contextual encoder to extract semantic relationships among the paragraph, question, option, and evidence. Finally, we employ a capsule network to extract different linguistic units (word and phrase) from the relations, and dynamically predict the optimal option based on the extracted units. Experiments on the CosmosQA dataset demonstrate that the proposed CEGI model outperforms the current state-of-the-art approaches and achieves the highest accuracy (83.6%) on the leaderboard. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3740
-
Lo 2023
On Exploring the Reasoning Capability of Large Language Models with Knowledge Graphs
arXiv 2023;(): 2023 Ref ID: 7964 This paper examines the capacity of LLMs to reason with knowledge graphs using their internal knowledge graph, i.e., the knowledge graph they learned during pre-training. Two research questions are formulated to investigate the accuracy of LLMs in recalling information from pre-training knowledge graphs and their ability to infer knowledge graph relations from context. To address these questions, we employ LLMs to perform four distinct knowledge graph reasoning tasks. Furthermore, we identify two types of hallucinations that may occur during knowledge reasoning with LLMs: content and ontology hallucination. Our experimental results demonstrate that LLMs can successfully tackle both simple and complex knowledge graph reasoning tasks from their own memory, as well as infer from input context. |
yuexi
voted
Davis
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#494
-
Lombardo 2024
Language Models Fine-Tuning for Automatic Format Reconstruction of SEC Financial Filings
The analysis of financial reports is a crucial task for investors and regulators, especially the mandatory annual reports (10-K) required by the SEC (Securities and Exchange Commission) that provide crucial information about a public company in the American stock market. Although SEC suggests a specific document format to standardize and simplify the analysis, in recent years, several companies have introduced their own format and organization of the contents, making human-based and automatic knowledge extraction inherently more difficult. In this research work, we investigate different Neural language models based on Transformer networks (Bidirectional recurrence-based, Autoregressive-based, and Autoencoders-based approaches) to automatically reconstruct an SEC-like format of the documents as a multi-class classification task with 18 classes at the sentence level. In particular, we propose a Bidirectional fine-tuning procedure to specialize pre-trained language models on this task. We propose and make the resulting novel transformer model, named SEC-former, publicly available to deal with this task. We evaluate SEC-former in three different scenarios: 1) in terms of topic detection performances; 2) in terms of document similarity (TF-IDF Bag-of-words and Doc2Vec) achieved with respect to original and trustable financial reports since this operation is leveraged for portfolio optimization tasks; and 3) testing the model in a real use-case scenario related to a public company that does not respect the SEC format but provides a human-supervised reference to reconstruct it. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
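TF-IDF bag-of-words document similarity, one of the evaluation criteria in the record above, can be sketched with smoothed idf weighting and cosine similarity (a minimal stdlib version, not the authors' pipeline):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    # Smoothed TF-IDF vectors for a small corpus of token lists.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Smoothed idf avoids zeroing out terms that appear in every document.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vecs

def cosine(u, v):
    # Cosine similarity between two sparse dict vectors.
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical filing sentences, tokenized by whitespace.
docs = [
    "risk factors include market volatility".split(),
    "risk factors include market volatility".split(),
    "the board approved a dividend".split(),
]
vecs = tfidf_vectors(docs)
```

Identical documents score 1.0 and documents sharing no terms score 0.0, which is the property the paper relies on when comparing reconstructed reports against trusted originals.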
#789
-
Lonergan 2024
Stratified Evaluation of Large Language Model GPT-4's Question-Answering In Surgery reveals AI Knowledge Gaps
|
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#788
-
Lonergan 2023
Stratified Evaluation of GPT's Question Answering in Surgery Reveals Artificial Intelligence (AI) Knowledge Gaps
Large language models (LLMs) have broad potential applications in medicine, such as aiding with education, providing reassurance to patients, and supporting clinical decision-making. However, there is a notable gap in understanding their applicability and performance in the surgical domain and how their performance varies across specialties. This paper aims to evaluate the performance of LLMs in answering surgical questions relevant to clinical practice and to assess how this performance varies across different surgical specialties.We used the MedMCQA dataset, a large-scale multi-choice question-answer (MCQA) dataset consisting of clinical questions across all areas of medicine. We extracted the relevant 23,035 surgical questions and submitted them to the popular LLMs Generative Pre-trained Transformers (GPT)-3.5 and GPT-4 (OpenAI OpCo, LLC, San Francisco, CA). Generative Pre-trained Transformer is a large language model that can generate human-like text by predicting subsequent words in a sentence based on the context of the words that come before it. It is pre-trained on a diverse range of texts and can perform a variety of tasks, such as answering questions, without needing task-specific training. The question-answering accuracy of GPT was calculated and compared between the two models and across surgical specialties. Both GPT-3.5 and GPT-4 achieved accuracies of 53.3% and 64.4%, respectively, on surgical questions, showing a statistically significant difference in performance. When compared to their performance on the full MedMCQA dataset, the two models performed differently: GPT-4 performed worse on surgical questions than on the dataset as a whole, while GPT-3.5 showed the opposite pattern. 
Significant variations in accuracy were also observed across different surgical specialties, with strong performances in anatomy, vascular, and paediatric surgery and worse performances in orthopaedics, ENT, and neurosurgery.Large language models exhibit promising capabilities in addressing surgical questions, although the variability in their performance between specialties cannot be ignored. The lower performance of the latest GPT-4 model on surgical questions relative to questions across all medicine highlights the need for targeted improvements and continuous updates to ensure relevance and accuracy in surgical applications. Further research and continuous monitoring of LLM performance in surgical domains are crucial to fully harnessing their potential and mitigating the risks of misinformation. |
Kwesi
voted
Xinchen
voted
#1648
-
Lorenzo 2024
Mitigating Data Scarcity in Semantic Parsing across Languages: the Multilingual Semantic Layer and its Dataset
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14056-14080 Association for Computational Linguistics (ACL) 2024 Ref ID: 4220 Data scarcity is a prevalent challenge in the era of Large Language Models (LLMs). The insatiable hunger of LLMs for large corpora becomes even more pronounced when dealing with non-English and low-resource languages. The issue is particularly exacerbated in Semantic Parsing (SP), i.e. the task of converting text into a formal representation. The complexity of semantic formalisms makes training human annotators and subsequent data annotation unfeasible on a large scale, especially across languages. To mitigate this, we first introduce the Multilingual Semantic Layer (MSL), a conceptual evolution of previous formalisms, which decouples from disambiguation and external inventories and simplifies the task. MSL provides the necessary tools to encode the meaning across languages, paving the way for developing a high-quality semantic parsing dataset across different languages in a semi-automatic strategy. Subsequently, we manually refine a portion of this dataset and fine-tune GPT-3.5 to propagate these refinements across the dataset. Then, we manually annotate 1,100 sentences in eleven languages, including low-resource ones. Finally, we assess our dataset's quality, showcasing the performance gap reduction across languages in Semantic Parsing. Our code and dataset are openly available at https://github.com/SapienzaNLP/MSL. © 2024 Association for Computational Linguistics. |
Kwesi
voted
yuexi
voted
#1878
-
Lotfy 2024
Sentiment Analysis for Arabic Product Reviews using LLMs and Knowledge Graphs
6th International Conference on Computing and Informatics, ICCI 2024 2024;():411-417 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICCI61671.2024.10485037 · Ref ID: 4687 The exploration of sentiment analysis in multilingual contexts, particularly through the integration of deep learning techniques and knowledge graphs, represents a significant advance in language processing research. This study specifically concentrates on the Arabic language, addressing the challenges presented by its morphological complexity. While the primary focus is Arabic, the research also includes a comprehensive review of related work in other languages such as Bangla and Chinese. This contextualizes the challenges and solutions found in Arabic sentiment analysis within a broader multilingual landscape. Utilizing pre-trained language models like BERT, the research has achieved noteworthy improvements in sentiment analysis accuracy and efficiency, particularly for the Arabic language. The integration of knowledge graphs stands out as a crucial innovation, offering essential contextual insights and mitigating the limitations posed by sparse labeled datasets in Arabic, a language less resourced compared to English. The findings of this study highlight the effectiveness of tailored BERT models for Arabic sentiment analysis, revealing the vast potential and inherent challenges of employing knowledge graphs and large language models for a deeper, more nuanced understanding. The future direction of this research includes enhancing these methods with cutting-edge machine learning techniques, aiming to further refine sentiment analysis processes and knowledge graph construction with a focus on Arabic within a multilingual framework. © 2024 IEEE. |
Mike
voted
mohammed afaan
voted
#856
-
Lourie 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence 2021;35():13480-13488 Electr Network Assoc Advancement Artificial Intelligence 2021 Ref ID: 3543 Commonsense AI has long been seen as a near-impossible goal until recently. Now, research interest has sharply increased with an influx of new benchmarks and models. We propose two new ways to evaluate commonsense models, emphasizing their generality on new tasks and building on diverse, recently introduced benchmarks. First, we propose a new multitask benchmark, RAINBOW, to promote research on commonsense models that generalize well over multiple tasks and datasets. Second, we propose a novel evaluation, the cost equivalent curve, that sheds new insight on how the choice of source datasets, pretrained language models, and transfer learning methods impacts performance and data efficiency. We perform over 200 experiments encompassing 4800 models and report multiple valuable and sometimes surprising findings, e.g., that transfer almost always leads to better or equivalent performance if following a particular recipe, that QA-based commonsense datasets transfer well with each other, while commonsense knowledge graphs do not, and that perhaps counter-intuitively, larger models benefit more from transfer than smaller ones. Last but not least, we introduce a new universal commonsense reasoning model, UNICORN, that establishes new state-of-the-art performance across 8 popular commonsense benchmarks: αNLI (87.3%), CosmosQA (91.8%), HellaSWAG (93.9%), PIQA (90.1%), SocialIQA (83.2%), WinoGrande (86.6%), CycIC (94.0%), and CommonsenseQA (79.3%). |
Srividya
voted
Ishan
voted
#1330
-
Lovelace 2022
A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5937-5955 Association for Computational Linguistics (ACL) 2022 Ref ID: 5468 Recent work has demonstrated that entity representations can be extracted from pre-trained language models to develop knowledge graph completion models that are more robust to the naturally occurring sparsity found in knowledge graphs. In this work, we conduct a comprehensive exploration of how to best extract and incorporate those embeddings into knowledge graph completion models. We explore the suitability of the extracted embeddings for direct use in entity ranking and introduce both unsupervised and supervised processing methods that can lead to improved downstream performance. We then introduce supervised embedding extraction methods that can extract more informative representations. We then synthesize our findings and develop a knowledge graph completion model that significantly outperforms recent neural models. © 2022 Association for Computational Linguistics. |
yuexi
voted
Mike
voted
#2766
-
Lozano 2014
Ontology View Extraction: An Approach Based on Ontological Meta-properties
2014 IEEE 26th International Conference on Tools with Artificial Intelligence 2014;():122-129 2014 DOI: 10.1109/ICTAI.2014.28 · Ref ID: 6111 Ontologies have been applied in Computer Science to ensure semantic interoperability among multiple systems. With the increasing availability of ontologies, many approaches for promoting the sharing and reuse of ontologies have been investigated in recent years, such as ontology module extraction (modularization) and ontology view extraction. Approaches for ontology module extraction are used for extracting modules from large ontologies. On the other hand, ontology views have been used to provide the user with only the parts of the ontology that are useful for a given task. Thus, both ontology views and ontology modules encapsulate a subset of the original ontology, but they have different purposes. The literature has explored ontological meta-properties (such as identity and rigidity) for guiding the modeling decisions made during the ontology engineering process. Some of these meta-properties were formalized in foundational ontologies such as UFO (Unified Foundational Ontology). In this paper, we explore the use of ontological meta-properties for extracting ontology views. We propose a characterization of the notion of a well-founded ontology view, considering the ontological meta-properties of the concepts. Besides that, we also propose a language-independent algorithm for sub-ontology extraction that is guided by ontological meta-properties. Finally, we present a case study illustrating the application of our approach. |
mohammed afaan
voted
yuexi
voted
#334
-
Lu 2023
HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting
46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2023;():2052-2056 Taipei, TAIWAN Assoc Computing Machinery 2023 DOI: 10.1145/3539618.3591997 · Ref ID: 3437 Medical decision-making processes can be enhanced by comprehensive biomedical knowledge bases, which require fusing knowledge graphs constructed from different sources via a uniform index system. The index system often organizes biomedical terms in a hierarchy to provide the aligned entities with fine-grained granularity. To address the challenge of scarce supervision in the biomedical knowledge fusion (BKF) task, researchers have proposed various unsupervised methods. However, these methods heavily rely on ad-hoc lexical and structural matching algorithms, which fail to capture the rich semantics conveyed by biomedical entities and terms. Recently, neural embedding models have proved effective in semantic-rich tasks, but they rely on sufficient labeled data to be adequately trained. To bridge the gap between the scarce-labeled BKF and neural embedding models, we propose HiPrompt, a supervision-efficient knowledge fusion framework that elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts. Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt. |
Mike
voted
Davis
voted
#3463
-
Lu 2024
Generative Design of Functional Metal Complexes Utilizing the Internal Knowledge of Large Language Models
arXiv 2024;(): 2024 Ref ID: 8746 Designing functional transition metal complexes (TMCs) faces challenges due to the vast search space of metals and ligands, requiring efficient optimization strategies. Traditional genetic algorithms (GAs) are commonly used, employing random mutations and crossovers driven by explicit mathematical objectives to explore this space. Transferring knowledge between different GA tasks, however, is difficult. We integrate large language models (LLMs) into the evolutionary optimization framework (LLM-EO) and apply it in both single- and multi-objective optimization for TMCs. We find that LLM-EO surpasses traditional GAs by leveraging the chemical knowledge of LLMs gained during their extensive pretraining. Remarkably, without supervised fine-tuning, LLMs utilize the full historical data from optimization processes, outperforming those focusing only on top-performing TMCs. LLM-EO successfully identifies eight of the top-20 TMCs with the largest HOMO-LUMO gaps by proposing only 200 candidates out of a search space of 1.37 million TMCs. Through prompt engineering using natural language, LLM-EO introduces unparalleled flexibility into multi-objective optimizations, thereby circumventing the necessity for intricate mathematical formulations. As generative models, LLMs can suggest new ligands and TMCs with unique properties by merging both internal knowledge and external chemistry data, thus combining the benefits of efficient optimization and molecular generation. With the increasing potential of LLMs as pretrained foundational models and new post-training inference strategies, we foresee broad applications of LLM-based evolutionary optimization in chemistry and materials design. |
yuexi
voted
Kwesi
voted
#654
-
Lu 2023
PKAT: Pre-training in Collaborative Knowledge Graph Attention Network for Recommendation
23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():448-457 Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023 DOI: 10.1109/icdm58522.2023.00054 · Ref ID: 3011 With the rapid growth of online platforms and the abundance of available information, personalized recommender systems have become essential for assisting users in discovering relevant and interesting content. Among the various methods, the knowledge-aware recommendation model has achieved notable success by leveraging the rich semantic information encoded in knowledge graphs. However, it overlooks the fact that users' historical click sequences can better reflect their preferences within a period of time, thus imposing certain limitations on recommendation performance. On the other hand, the application of pre-trained language models in recommender systems has demonstrated increasingly significant potential, as they can capture sequential patterns and dependencies within users' historical click sequences and effectively capture contextual information in user-item interactions. To this end, we propose a hybrid recommendation model that leverages Pre-training in the collaborative Knowledge graph Attention neTwork (PKAT) to extract both the high-order connectivity information in collaborative knowledge graphs and the contextual information in users' historical click sequences captured by Bidirectional Encoder Representations from Transformers (BERT). The collaborative knowledge graph attention network enables the model to effectively capture the intricate relationships between users, items, and knowledge entities, thus enhancing the representation learning process. Furthermore, what sets PKAT apart from other state-of-the-art knowledge-aware recommendation methods is the incorporation of the BERT language model. This integration allows PKAT to capture the contextual sequence information of user behavior, enabling it to generate more accurate and personalized recommendations. Extensive experiments are conducted on multiple benchmark datasets, and the results demonstrate that our PKAT model outperforms several state-of-the-art baselines. |
mohammed afaan
voted
yuexi
voted
#1014
-
Lu 2003
Automatically acquiring Chinese parsing knowledge based on a bilingual language model
Jisuanji Xuebao 2003;26(1):32-38 2003 Ref ID: 5825 Knowledge acquisition is a bottleneck for the real-world application of Chinese parsing. This paper presents a new method to acquire Chinese parsing knowledge from sentence-aligned English-Chinese bilingual corpora. Using English parsing and word alignment results, the method first performs bilingual structure alignment based on a bilingual language model, Inversion Transduction Grammars. Then, Chinese bracketing structures are extracted automatically. The method creates structure-bracketed Chinese corpora by taking full advantage of English parsing and bilingual corpora. The created corpora are very useful for further Chinese corpus annotation and parsing knowledge acquisition. Preliminary experiments show that the acquired knowledge accords well with manually crafted knowledge. The method is particularly useful for acquiring parsing knowledge for an under-studied language from a well-studied second language. Although this paper deals with Chinese and English, the proposed method is also applicable to other language pairs. |
Ishan
voted
brandon
voted
#1081
-
Lu 2024
ClinicalRAG: Enhancing Clinical Decision Support through Heterogeneous Knowledge Retrieval
KnowLLM 2024 - 1st Workshop on Towards Knowledgeable Language Models, Proceedings of the Workshop 2024;():64-68 Association for Computational Linguistics (ACL) 2024 Ref ID: 4281 Large Language Models (LLMs) have revolutionized text generation across diverse domains, showcasing an ability to mimic human-like text with remarkable accuracy. Yet, these models frequently encounter a significant hurdle: producing hallucinations, a flaw particularly detrimental in the healthcare domain where precision is crucial. In this paper, we introduce ClinicalRAG, a novel multi-agent pipeline to rectify this issue by incorporating heterogeneous medical knowledge-both structured and unstructured-into LLMs to bolster diagnosis accuracy. ClinicalRAG can extract related medical entities from user inputs and dynamically integrate relevant medical knowledge during the text generation process. Comparative analyses reveal that ClinicalRAG significantly outperforms knowledge-deficient methods, offering enhanced reliability in clinical decision support. This advancement marks a pivotal proof-of-concept step towards mitigating misinformation risks in healthcare applications of LLMs. © 2024 Association for Computational Linguistics. |
Srividya
voted
Xinchen
voted
#1218
-
Lu 2024
Dynamic Reasoning with Language Model and Knowledge Graph for Question Answering
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14807 LNCS():441-455 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-3-031-70546-5_26 · Ref ID: 4213 Question answering (QA) involves reasoning about the context and latent knowledge of complex textual descriptions. Current research focuses on how to effectively utilize knowledge graphs (KG) to enhance language models (LM) with external knowledge. In previous works, the interactions between the QA context and the KG were limited, and the KG input to the model contained noisy nodes, greatly restricting the model's reasoning ability. We propose a dynamic reasoning model, DLM-KG, which is based on an LM and a KG. It resolves the above challenges through dynamic hierarchical interaction between the QA context and the KG, joint reasoning between the LM and the KG, and dynamic pruning of the KG. Specifically, DLM-KG extracts hierarchical features from KG representations and performs inter-layer and intra-layer interactions in each iteration. The features from these interactions enter the joint reasoning module, where each QA context feature and KG feature mutually attend to each other. The representations of the two modalities are fused and updated through multi-step interactions. Finally, using the information provided by the interaction layer, irrelevant nodes in the KG are removed. Experiments conducted on the commonsense datasets CommonsenseQA and OpenbookQA and the medical question-answering dataset MedQA-USMLE show that performance on MedQA-USMLE is superior to baseline models, while on the other datasets performance is close to the baselines, demonstrating its competitiveness in terms of reasoning ability. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
Srividya
voted
Ishan
voted
#1126
-
Lu 2021
Construction of Diabetes Knowledge Graph Based on Deep Learning
Proceedings - 2021 7th Annual International Conference on Network and Information Systems for Computers, ICNISC 2021 2021;():966-970 Institute of Electrical and Electronics Engineers Inc. 2021 DOI: 10.1109/ICNISC54316.2021.00181 · Ref ID: 5664 To integrate medical data scattered over the internet, natural language processing (NLP) is widely used in medical text mining. BERT (Bidirectional Encoder Representations from Transformers) stands out among representation models, and vector representations based on the BERT pre-trained language model can help the target task learn more semantic information. The knowledge graph intuitively reveals the relationships between entities and helps explore deeper semantic connections between them. There are three important parts in the construction of a knowledge graph: entity extraction, relation extraction, and graph generation. Based on these methods, this paper proposes a BERT-based named entity recognition model, BERT-BiLSTM-CRF, which outperforms established methods. In the relation extraction part, BERT-Softmax is used to improve the semantic expression, and its F1 value increased by 12 percent compared with the traditional entity relation extraction model. Based on the above, the entities of diabetes and their relationships were redefined to enrich the semantics of the knowledge graph. Finally, the Neo4j graph database was used to realize the visualization of the diabetes knowledge graph. © 2021 IEEE. |
Mike
voted
Srividya
voted
#3884
-
Lu 2022
Structured Knowledge Grounding for Question Answering
arXiv 2022;(): 2022 Ref ID: 7579 Can language models (LM) ground question-answering (QA) tasks in a knowledge base via their inherent relational reasoning ability? While previous models that use only LMs have seen some success on many QA tasks, more recent methods include knowledge graphs (KG) to complement LMs with their more logic-driven implicit knowledge. However, how to effectively extract information from structured data like KGs to empower LMs remains an open question, and current models rely on graph techniques to extract knowledge. In this paper, we propose to solely leverage the LMs to combine the language and knowledge for knowledge-based question answering with flexibility, breadth of coverage, and structured reasoning. Specifically, we devise a knowledge construction method that retrieves the relevant context with a dynamic hop, which expresses more comprehensiveness than traditional GNN-based techniques. And we devise a deep fusion mechanism to further bridge the information-exchange bottleneck between the language and the knowledge. Extensive experiments show that our model consistently demonstrates state-of-the-art performance on the CommonsenseQA benchmark, showcasing the possibility of leveraging LMs solely to robustly ground QA into the knowledge base. |
yuexi
voted
Mike
voted
#1640
-
Luan 2024
A Methodology for Generating and Optimizing Chain-of-Thought Based on Knowledge Graphs
Advances in Transdisciplinary Engineering 2024;47():313-324 IOS Press BV 2024 DOI: 10.3233/ATDE231203 · Ref ID: 4126 One of the critical indicators for assessing the practical applicability of large language models is their competency in vertical-domain question-answering tasks. However, in real-world applications, fine-tuning these large models often compromises their inherent capabilities. Moreover, fine-tuning does not offer precise control over the model's generated outputs. Consequently, enhancing the question-answering performance of large models in specialized domains has become a focal concern in the field. To address these challenges, this paper introduces a novel approach for generating and optimizing a 'Chain-of-Thought' (CoT), leveraging domain-specific knowledge graphs. Specifically, we propose a Knowledge Graph-generated Chain of Thought (KGCoT) method that utilizes graph search algorithms to generate a chain of thought. This chain guides the injection of specialized knowledge into large language models and adapts the weightings based on user feedback, thereby optimizing subsequent graph searches. Heuristic searches are performed on the knowledge graph based on edge weights, culminating in the amalgamation of discovered entities and knowledge into a chain of thought. This KGCoT serves as a prompt to stimulate the large language model's contemplation of domain-specific knowledge. Additionally, an adaptive weight optimization formula refines the chain's weights in response to output feedback, thereby continually enhancing the quality of future search results and ensuring real-time optimization capabilities for the model. Through empirical evaluations conducted on publicly available datasets, the large language model ChatGLM, when prompted with a KGCoT, exhibited a 72.8% improvement in its BLEU score compared to its baseline performance. 
This outperformed other models like LLaMA and RWKV, unequivocally substantiating the efficacy of the proposed KGCoT method. © 2024 The Authors. |
mohammed afaan
voted
yuexi
voted
#1965
-
Luís Ferreira 2024
Towards Automated Evaluation of Knowledge Encoded in Large Language Models
Proceedings of the Workshop on DLnLD 2024: Deep Learning and Linked Data at LREC-COLING 2024 - Workshop Proceedings 2024;():76-85 European Language Resources Association (ELRA) 2024 Ref ID: 4642 Large Language Models (LLMs) have a significant user base and are gaining increasing interest and impact across various domains. Given their expanding influence, it is crucial to implement appropriate guardrails or controls to ensure ethical and responsible use. In this paper, we propose to automate the evaluation of the knowledge stored in LLMs. This is achieved by generating datasets tailored for this specific purpose, in any selected domain. Our approach consists of four major steps: (i) extraction of relevant entities; (ii) gathering of domain properties; (iii) dataset generation; and (iv) model evaluation. To materialize this vision, tools and resources were tested for entity linking, knowledge acquisition, classification, and prompt generation, yielding valuable insights and lessons. The generation of datasets for domain-specific model evaluation has successfully shown that the approach can become a future tool for evaluating LLMs and turning these "black boxes" into human-interpretable knowledge bases. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Mike
voted
Kwesi
voted
#1788
-
Luo 2024
Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning
12th International Conference on Learning Representations, ICLR 2024 2024;(): International Conference on Learning Representations, ICLR 2024 Ref ID: 4606 Large language models (LLMs) have demonstrated impressive reasoning abilities in complex tasks. However, they lack up-to-date knowledge and experience hallucinations during reasoning, which can lead to incorrect reasoning processes and diminish their performance and trustworthiness. Knowledge graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. Nevertheless, existing KG-based LLM reasoning methods only treat KGs as factual knowledge bases and overlook the importance of their structural information for reasoning. In this paper, we propose a novel method called reasoning on graphs (RoG) that synergizes LLMs with KGs to enable faithful and interpretable reasoning. Specifically, we present a planning-retrieval-reasoning framework, where RoG first generates relation paths grounded by KGs as faithful plans. These plans are then used to retrieve valid reasoning paths from the KGs for LLMs to conduct faithful reasoning. Furthermore, RoG not only distills knowledge from KGs to improve the reasoning ability of LLMs through training but also allows seamless integration with any arbitrary LLMs during inference. Extensive experiments on two benchmark KGQA datasets demonstrate that RoG achieves state-of-the-art performance on KG reasoning tasks and generates faithful and interpretable reasoning results. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved. |
Ishan
voted
Srividya
voted
#1926
-
Luo 2023
Systematic Assessment of Factual Knowledge in Large Language Models
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():13272-13286 Association for Computational Linguistics (ACL) 2023 Ref ID: 5082 Previous studies have relied on existing question-answering benchmarks to evaluate the knowledge stored in large language models (LLMs). However, this approach has limitations regarding factual knowledge coverage, as it mostly focuses on generic domains which may overlap with the pretraining data. This paper proposes a framework to systematically assess the factual knowledge of LLMs by leveraging knowledge graphs (KGs). Our framework automatically generates a set of questions and expected answers from the facts stored in a given KG, and then evaluates the accuracy of LLMs in answering these questions. We systematically evaluate state-of-the-art LLMs with KGs in generic and specific domains. The experiments show that ChatGPT is consistently the top performer across all domains. We also find that LLMs' performance depends on instruction fine-tuning, domain, and question complexity, and is susceptible to adversarial context. © 2023 Association for Computational Linguistics. |
yuexi
voted
Mike
voted
#3258
-
Luo 2023
ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning
arXiv 2023;(): 2023 Ref ID: 7829 Logical rules are essential for uncovering the logical connections between relations, which could improve reasoning performance and provide interpretable results on knowledge graphs (KGs). Although there have been many efforts to mine meaningful logical rules over KGs, existing methods suffer from computationally intensive searches over the rule space and a lack of scalability for large-scale KGs. Besides, they often ignore the semantics of relations which is crucial for uncovering logical connections. Recently, large language models (LLMs) have shown impressive performance in the field of natural language processing and various applications, owing to their emergent ability and generalizability. In this paper, we propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs. Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs to prompt LLMs to generate logical rules. To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs. Last, the ranked rules can be used to conduct reasoning over KGs. ChatRule is evaluated on four large-scale KGs, w.r.t. different rule quality metrics and downstream tasks, showing the effectiveness and scalability of our method. |
Srividya
voted
Xinchen
voted
#3930
-
Luo 2023
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models
arXiv 2023;(): 2023 Ref ID: 7860 Logical reasoning is fundamental for humans yet presents a substantial challenge in the domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and Reasoning (KR) systems that did not scale and required non-trivial manual effort. Recently, the emergence of large language models (LLMs) has demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. Consequently, there's a growing interest in using LLMs for logical reasoning via natural language. This work strives to understand the proficiency of LLMs in logical reasoning by offering a brief review of the latest progress in this area, with a focus on the logical reasoning datasets, tasks, and the methods adopted to utilize LLMs for reasoning. To offer a thorough analysis, we have compiled a benchmark titled LogiGLUE. This includes 24 varied datasets encompassing deductive, abductive, and inductive reasoning. Utilizing LogiGLUE as a foundation, we have trained an instruction fine-tuned language model, resulting in LogiT5. We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess the performance of the model across the different logical reasoning categories. We also assess various LLMs using LogiGLUE, and the findings indicate that LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We aim to shed light on the capabilities and potential pathways for enhancing logical reasoning proficiency in LLMs, paving the way for more advanced and nuanced developments in this critical field. |
brandon
voted
Kwesi
voted
#3114
-
Luo 2024
Bridging Gaps in Content and Knowledge for Multimodal Entity Linking
Proceedings of the 32nd ACM International Conference on Multimedia 2024;():9311–9320 Melbourne VIC, Australia Association for Computing Machinery 2024 DOI: 10.1145/3664647.3681661 · Ref ID: 7228 |
mohammed afaan
voted
yuexi
voted
#1488
-
Luo 2024
KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():7146-7159 Association for Computational Linguistics (ACL) 2024 Ref ID: 4482 Parameter-efficient finetuning (PEFT) is a key technique for adapting large language models (LLMs) to downstream tasks. In this paper, we study leveraging knowledge graph embeddings to improve the effectiveness of PEFT. We propose a knowledgeable adaptation method called KnowLA. It inserts an adaptation layer into an LLM to integrate the embeddings of entities appearing in the input text. The adaptation layer is trained in combination with LoRA on instruction data. Experiments on six benchmarks with two popular LLMs and three knowledge graphs demonstrate the effectiveness and robustness of KnowLA. We show that KnowLA can help activate the relevant parameterized knowledge in an LLM to answer a question without changing its parameters or input prompts. © 2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
#1084
-
Lv 2024
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Proceedings of Machine Learning Research 2024;235():33594-33623 ML Research Press 2024 Ref ID: 4367 Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language model (RALM)-which enhances models with up-to-date knowledge-emerges as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when retrieving lengthy contexts. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method to focus on different granularity-level key texts, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: recaller, scorer, and selector. First, recaller applies a knowledge graph to extract potential key entities in a given context. Second, scorer measures the importance of each entity by calculating its contextual weight. Finally, selector selects high contextual weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, leading to a superior performance over 30% in the F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering. Copyright 2024 by the author(s) |
Mike
voted
Srividya
voted
#481
-
Lysyuk 2024
Konstruktor: A Strong Baseline for Simple Knowledge Graph Question Answering
2nd International Conference on Engineering Manufacture (EM) 2024;():107-118 Porto, PORTUGAL Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-70242-6_11 · Ref ID: 2987 While being one of the most popular question types, simple questions such as "Who is the author of Cinderella?" are still not completely solved. Surprisingly, even the most powerful modern Large Language Models (LLMs) are prone to errors when dealing with such questions, especially with rare entities. At the same time, as an answer may be one hop away from the question entity, one can try to develop a method that uses structured knowledge graphs (KGs) to answer such questions. In this paper, we introduce Konstruktor -- an efficient and robust approach that breaks down the problem into three steps: (i) entity extraction and entity linking, (ii) relation prediction, and (iii) querying the knowledge graph. Our approach integrates language models and knowledge graphs, exploiting the power of the former and the interpretability of the latter. We experiment with two named entity recognition and entity linking methods and several relation detection techniques. We show that for relation detection, the most challenging step of the workflow, a combination of relation classification/generation and ranking outperforms other methods. On four datasets, we report the strong performance of Konstruktor. |
Mike
voted
Srividya
voted
#3473
-
Lyu 2024
GP-GPT: Large Language Model for Gene-Phenotype Mapping
arXiv 2024;(): 2024 Ref ID: 8600 Pre-trained large language models (LLMs) have attracted increasing attention in biomedical domains due to their success in natural language processing. However, the complex traits and heterogeneity of multi-source genomics data pose significant challenges when adapting these models to the bioinformatics and biomedical field. To address these challenges, we present GP-GPT, the first specialized large language model for genetic-phenotype knowledge representation and genomics relation analysis. Our model is fine-tuned in two stages on a comprehensive corpus composed of over 3,000,000 terms in genomics, proteomics, and medical genetics, derived from multiple large-scale validated datasets and scientific publications. GP-GPT demonstrates proficiency in accurately retrieving medical genetics information and performing common genomics analysis tasks, such as genomics information retrieval and relationship determination. Comparative experiments across domain-specific tasks reveal that GP-GPT outperforms state-of-the-art LLMs, including Llama2, Llama3 and GPT-4. These results highlight GP-GPT's potential to enhance genetic disease relation research and facilitate accurate and efficient analysis in the fields of genomics and medical genetics. Our investigation demonstrated subtle changes in the representations of bio-factor entities in GP-GPT, suggesting opportunities for applying LLMs to advance gene-phenotype research. |
Davis
voted
mohammed afaan
voted
#3841
-
Lyu 2024
Retrieve-Plan-Generation: An Iterative Planning and Answering Framework for Knowledge-Intensive LLM Generation
arXiv 2024;(): 2024 Ref ID: 8410 Despite the significant progress of large language models (LLMs) in various tasks, they often produce factual errors due to their limited internal knowledge. Retrieval-Augmented Generation (RAG), which enhances LLMs with external knowledge sources, offers a promising solution. However, these methods can be misled by irrelevant paragraphs in retrieved documents. Due to the inherent uncertainty in LLM generation, inputting the entire document may introduce off-topic information, causing the model to deviate from the central topic and affecting the relevance of the generated content. To address these issues, we propose the Retrieve-Plan-Generation (RPG) framework. RPG generates plan tokens to guide subsequent generation in the plan stage. In the answer stage, the model selects relevant fine-grained paragraphs based on the plan and uses them for further answer generation. This plan-answer process is repeated iteratively until completion, enhancing generation relevance by focusing on specific topics. To implement this framework efficiently, we utilize a simple but effective multi-task prompt-tuning method, enabling the existing LLMs to handle both planning and answering. We comprehensively compare RPG with baselines across 5 knowledge-intensive generation tasks, demonstrating the effectiveness of our approach. |
yuexi
voted
Srividya
voted
#1460
-
Ma 2023
KAPALM: Knowledge grAPh enhAnced Language Model for Fake News Detection
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():3999-4009 Association for Computational Linguistics (ACL) 2023 Ref ID: 5058 Social media has not only facilitated news consumption, but also led to the wide spread of fake news. Because news articles in social media are usually condensed and full of knowledge entities, existing methods of fake news detection use external entity knowledge to improve the effectiveness. However, the majority of these methods focus on news entity information and ignore the structured relation knowledge among news entities. To address this issue, in this work, we propose a Knowledge grAPh enhAnced Language Model (KAPALM) which is a novel model that fuses coarse- and fine-grained representations of entity knowledge from Knowledge Graphs (KGs). Firstly, we identify entities in news content and link them to entities in KGs. Then, a subgraph of KGs is extracted to provide structured relation knowledge of entities in KGs and fed into a graph neural network to obtain the coarse-grained knowledge representation. This subgraph is pruned to provide fine-grained knowledge and fed into the attentive graph pooling layer. Finally, we integrate the coarse- and fine-grained entity knowledge representations with the representation of news content for fake news detection. The experimental results on two benchmark datasets show that our method is superior to state-of-the-art baselines in the full-scale setting. In addition, our model is competitive in the few-shot setting. © 2023 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
#731
-
Ma 2024
A review of graph neural networks and pretrained language models for knowledge graph reasoning
A Knowledge Graph (KG) stores human knowledge facts in an intuitive graphical structure but faces challenges such as incomplete construction or inability to handle new knowledge. Knowledge Graph Reasoning (KGR) can make KGs more accurate, complete, and trustworthy to better support various artificial intelligence applications. Currently, the popular KGR methods are based on graph neural networks (GNNs). Recent studies have shown that hybrid logic rules and synergized pre-trained language models (PLMs) can enhance GNN-based KGR methods. These methods mainly focus on data sparsity, insufficient knowledge evolution patterns, multimodal fusion, and few-shot reasoning. Although many studies have been conducted, there are still few review papers that comprehensively summarize and explore KGR methods related to GNNs, logic rules, and PLMs. Therefore, this paper provides a comprehensive review of GNNs and PLMs for KGR based on a large number of high-quality papers. To present a clear overview of KGR, we propose a general framework. Specifically, we first introduce KG preparation. Then we provide an overview of KGR methods, in which we categorize KGR methods into GNN-based, logic rules-enhanced, and pre-trained language model-enhanced KGR methods. Furthermore, we compare and analyze the GNN-based KGR methods in two scenarios. Moreover, we present the application of KGR in different fields. Finally, we discuss the current challenges and future research directions for KGR. |
Ishan
voted
Srividya
voted
#628
-
Ma 2023
Ontology-Based BERT Model for Automated Information Extraction from Geological Hazard Reports
Geological knowledge can provide support for knowledge discovery, knowledge inference and mineralization predictions of geological big data. Entity identification and relationship extraction from geological data description text are the key links for constructing knowledge graphs. Given the lack of publicly annotated datasets in the geology domain, this paper illustrates the construction process of geological entity datasets, defines the types of entities and interconceptual relationships by using the geological entity concept system, and completes the construction of the geological corpus. To address the shortcomings of existing language models (such as Word2vec and GloVe) that cannot disambiguate polysemous words and have a poor ability to fuse contexts, we propose a geological named entity recognition and relationship extraction model built jointly with the Bidirectional Encoder Representations from Transformers (BERT) pretrained language model. To effectively represent the text features, we construct a BERT-bidirectional gated recurrent unit network (BiGRU)-conditional random field (CRF)-based architecture to extract the named entities and a BERT-BiGRU-Attention-based architecture to extract the entity relations. The results show that the F1-score of the BERT-BiGRU-CRF named entity recognition model is 0.91 and the F1-score of the BERT-BiGRU-Attention relationship extraction model is 0.84, which are significant performance improvements when compared to classic language models (e.g., Word2vec and Embeddings from Language Models (ELMo)). |
Mike
voted
Srividya
voted
#1022
-
Ma 2023
BERT-based Question Answering using Knowledge Graph Embeddings in Nuclear Power Domain
Proceedings of the 2023 26th International Conference on Computer Supported Cooperative Work in Design, CSCWD 2023 2023;():267-272 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/CSCWD57460.2023.10152692 · Ref ID: 5266 In order to improve the resource utilization of existing nuclear power data, help workers efficiently obtain operation information of nuclear power units, and assist them in fault diagnosis and maintenance decision-making, this paper constructs a knowledge graph question answering (KGQA) dataset in the field of nuclear power. The BEm-KGQA model, based on a pre-trained language model and a knowledge graph embedding method, is proposed. Our model learns the embedded representation of the knowledge graph through BERT and fine-tunes the BERT model. In the question embedding stage, it learns the embedded representation of the question based on the fine-tuned BERT model. Through experiments, we demonstrate the effectiveness of the method over other models. In addition, this paper implements a nuclear power question answering system, through which employees can learn about unit information and efficiently obtain information on unusual operating events of nuclear power. © 2023 IEEE. |
Kwesi
voted
Ishan
voted
#3535
-
Madhusudhana 2024
Integrating Cognitive AI with Generative Models for Enhanced Question Answering in Skill-based Learning
arXiv 2024;(): 2024 Ref ID: 8489 In online learning, the ability to provide quick and accurate feedback to learners is crucial. In skill-based learning, learners need to understand the underlying concepts and mechanisms of a skill to be able to apply it effectively. While videos are a common tool in online learning, they cannot comprehend or assess the skills being taught. Additionally, while Generative AI methods are effective in searching and retrieving answers from a text corpus, it remains unclear whether these methods exhibit any true understanding. This limits their ability to provide explanations of skills or help with problem-solving. This paper proposes a novel approach that merges Cognitive AI and Generative AI to address these challenges. We employ a structured knowledge representation, the TMK (Task-Method-Knowledge) model, to encode skills taught in an online Knowledge-based AI course. Leveraging techniques such as Large Language Models, Chain-of-Thought, and Iterative Refinement, we outline a framework for generating reasoned explanations in response to learners' questions about skills. |
Ishan
voted
brandon
voted
#2741
-
Mainetti 2015
A novel rule-based semantic architecture for IoT building automation systems
2015 23rd International Conference on Software, Telecommunications and Computer Networks (SoftCOM) 2015;():124-131 2015 DOI: 10.1109/SOFTCOM.2015.7314063 · Ref ID: 6974 The ever-growing number of smart devices connected to the Internet of Things is giving users the chance to sense data from the surrounding environment and act upon it. However, interpreting raw data coming from heterogeneous sensors and applying control algorithms to actuators is not a simple task for the common end-user who wants to create applications for smart environments. For these reasons, this work deals with the definition of a novel rule-based semantic architecture for the implementation of building automation applications in an IoT context. Sensor data are abstracted at a high semantic level related to the properties they are associated with, and interactions with actuators are driven by high-level desired actions. Applications take the form of an Event-Condition-Action (ECA) rule, and the layered architecture separates high-level semantic reasoning aspects from low-level execution details. The proposed architecture is also compared with the main state-of-the-art solutions and some suitable technologies for its implementation are suggested. |
mohammed afaan
voted
yuexi
voted
#125
-
Malaviya 2020
Commonsense Knowledge Base Completion with Structural and Semantic Context
34th AAAI Conference on Artificial Intelligence / 32nd Innovative Applications of Artificial Intelligence Conference / 10th AAAI Symposium on Educational Advances in Artificial Intelligence 2020;34():2925-2933 New York, NY Assoc Advancement Artificial Intelligence 2020 Ref ID: 3109 Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes compared to conventional KBs (18x more nodes in ATOMIC compared to Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes. In this paper, we present novel KB completion models that can address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method to incorporate information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis on model predictions shines light on the types of commonsense knowledge that language models capture well. |
Davis
voted
Srividya
voted
#2760
-
Malik 2017
Ontology based context aware model
2017 International Conference on Computational Intelligence in Data Science (ICCIDS) 2017;():1-6 2017 DOI: 10.1109/ICCIDS.2017.8272632 · Ref ID: 6297 A top-down approach is followed while developing a context model, which means the application is defined first and then its functionality. Once this is set up, the required context models are developed. In this paper, we have presented a comparison of existing context ontologies on the basis of different parameters and also provided an ontology-based context model which defines the generic concepts and provides extensibility for adding domain-specific ontologies. This model uses the Extended Hierarchical Censored Production Rule (EHCPR), a scheme for representing knowledge. |
mohammed afaan
voted
Ishan
voted
#825
-
Maratsi 2024
Towards Cross-Domain Linking of Data: A Semantic Mapping of Cultural Heritage Ontologies
25th Annual International Conference on Digital Government Research (DGO) - Internet of Beings - Transforming Public Governance 2024;():165-176 Taipei, TAIWAN Assoc Computing Machinery 2024 DOI: 10.1145/3657054.3657077 · Ref ID: 3751 The Linked Open Vocabularies (LOV) registry, designed with the Linked Data principles at core, provides an environment suitable for research which targets domain-specific, but also potentially reusable, information representation. The main purpose of this study is to follow the recommendations pertaining to the utilisation of LOV as a basis for experimentation in order to examine how information within the Cultural Heritage (CH) domain can be improved in terms of reusability and interoperability. The present lack of cross-domain knowledge transfer forms the motivation behind this study, with the aim of facilitating the transition from conventional, domain-specific knowledge representation to reusable and semantically interoperable information. The methodology of this study involves the manual semantic mapping of elements from 12 vocabularies in the LOV registry, reinforced by a small-scale experiment using contemporary large language models (LLMs), particularly GPT, for a preliminary assessment of the mapping process. The findings revealed several key aspects to consider regarding the alignment of semantically adjacent vocabulary elements in the CH domain and beyond, emphasising the potential unveiled by linking domain-focused schemata to standardised, established ones while preserving the conceptual hierarchies inherent to each individual knowledge domain. The contribution of this research pertains to the vision of linking data across different domains by initiating the alignment among representation schemata in CH, with the ultimate aim to expand beyond the boundaries of the in-word knowledge domain, while employing combinatory methodological approaches of technological means and human expertise to facilitate this process. |
mohammed afaan
voted
yuexi
voted
#3356
-
Marjanović 2024
DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models
arXiv 2024;(): 2024 Ref ID: 8480 Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge. However, conflicting knowledge can be present in the LM's parameters, termed intra-memory conflict, which can affect a model's propensity to accept contextual knowledge. To study the effect of intra-memory conflict on an LM's ability to accept relevant context, we utilize two knowledge conflict measures and a novel dataset containing inherently conflicting data, DynamicQA. This dataset includes facts with a temporal dynamic nature where facts can change over time and disputable dynamic facts, which can change depending on the viewpoint. DynamicQA is the first to include real-world knowledge conflicts and provide context to study the link between the different types of knowledge conflicts. We also evaluate several measures on their ability to reflect the presence of intra-memory conflict: semantic entropy and a novel coherent persuasion score. With our extensive experiments, we verify that LMs exhibit a greater degree of intra-memory conflict with dynamic facts compared to facts that have a single truth value. Furthermore, we reveal that facts with intra-memory conflict are harder to update with context, suggesting that retrieval-augmented generation will struggle with the most commonly adapted facts. |
Mike
voted
Srividya
voted
#620
-
Martin-Moncunill 2022
On Contrasting YAGO with GPT-J: An Experiment for Person-Related Attributes
4th Iberoamerican Conference and 3rd Indo-American Conference Knowledge Graphs and Semantic Web Conference (KGSWC) 2022;1686():234-245 Madrid, SPAIN Springer International Publishing Ag 2022 DOI: 10.1007/978-3-031-21422-6_17 · Ref ID: 3351 Language models (LMs) trained on large text corpora have demonstrated superior performance in different language-related tasks in recent years. These models implicitly incorporate factual knowledge that can be used to complement existing Knowledge Graphs (KGs), which in most cases are structured from human-curated databases. Here we report an experiment that attempts to gain insights about the extent to which LMs can generate factual information like that present in KGs. Concretely, we have tested this process using the English Wikipedia subset of YAGO and the GPT-J model for attributes related to individuals. Results show that the generation of correct factual information depends on the generation parameters of the model and is unevenly balanced across diverse individuals. Further, the LM can be used to populate additional factual information, but it requires intermediate parsing to correctly map to KG attributes. |
Davis
voted
Srividya
voted
#1912
-
Martinez 2023
Study-Buddy: A Knowledge Graph-Powered Learning Companion for School Students
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13998 LNCS():133-137 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-43458-7_25 · Ref ID: 5157 Large Language Models (LLMs) have the potential to substantially improve educational tools for students. However, they face limitations, including factual accuracy, personalization, and the lack of control over the sources of information. This paper presents Study-Buddy, a prototype of a conversational AI assistant for school students to address the above-mentioned limitations. Study-Buddy embodies an AI assistant based on a knowledge graph, LLMs, and computational persuasion. It is designed to support educational campaigns as a hybrid AI solution. The demonstrator showcases interactions with Study-Buddy and the crucial role of the knowledge graph in enabling the bot to present the appropriate activities to the students. A video demonstrating the main features of Study-Buddy is available at: https://youtu.be/DHPTsN1RI9o. © The Author(s), under exclusive license to Springer Nature Switzerland AG. 2023. |
Mike
voted
mohammed afaan
voted
#1734
-
Maushagen 2024
Populating CSV Files from Unstructured Text with LLMs for KG Generation with RML
CEUR Workshop Proceedings 2024;3759(): CEUR-WS 2024 Ref ID: 4249 We report on an exploratory study using Large Language Models (LLMs) to generate Comma-Separated Values (CSV) files, which are subsequently transformed into Resource Description Framework (RDF) using the RDF Mapping Language (RML). Prior studies have shown that LLMs sometimes have problems generating valid and well-formed RDF from unstructured texts, i.e., issues with RDF, not the contents. We wanted to test whether the generation of CSV led to fewer issues and whether this would be a viable option for allowing domain experts to be actively part of the Knowledge Graph (KG) population process by allowing them to use familiar tools. We have built a prototype illustrating this idea, and the results seem promising for further study. The initial prototype uses zero-shot training and is built on GPT-4. The prototype takes the unstructured text and the CSV file's structure as input and uses the latter to generate prompts to fill in the cells' values. Future work includes analyzing the effect of different prompting strategies. The limitation, however, is that such an approach only works for projects where domain experts work with spreadsheets for pre-existing mappings. © 2024 Copyright for this paper by its authors. |
Ishan
voted
brandon
voted
#763
-
Mavromatis 2024
SemPool: Simple, Robust, and Interpretable KG Pooling for Enhancing Language Models
28th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2024;14648():154-166 Taipei, TAIWAN Springer-Verlag Singapore Pte Ltd 2024 DOI: 10.1007/978-981-97-2238-9_12 · Ref ID: 3346 Knowledge Graph (KG) powered question answering (QA) performs complex reasoning over language semantics as well as knowledge facts. Graph Neural Networks (GNNs) learn to aggregate information from the underlying KG, which is combined with Language Models (LMs) for effective reasoning with the given question. However, GNN-based methods for QA rely on the graph information of the candidate answer nodes, which limits their effectiveness in more challenging settings where critical answer information is not included in the KG. We propose a simple graph pooling approach that learns useful semantics of the KG to aid the LM's reasoning and whose effectiveness is robust under graph perturbations. Our method, termed SemPool, represents KG facts with pre-trained LMs, learns to aggregate their semantic information, and fuses it at different layers of the LM. Our experimental results show that SemPool outperforms state-of-the-art GNN-based methods by 2.27% accuracy points on average when answer information is missing from the KG. In addition, SemPool offers interpretability on what type of graph information is fused at different LM layers. |
Mike
voted
Xinchen
voted
#2403
-
Mehrabi 2013
Event Causality Identification Using Conditional Random Field in Geriatric Care Domain
2013 12th International Conference on Machine Learning and Applications 2013;1():339-343 2013 DOI: 10.1109/ICMLA.2013.69 · Ref ID: 6609 Event extraction is a key step in many text-mining applications such as question-answering, information extraction and summarization systems. In this study we used conditional random field (CRF) to extract causal events from PubMed articles related to Geriatric care. Abstracts of the geriatric care domain were manually reviewed and categorized into 42 different sub domains. There are a total of 19,677 sentences in the collected abstracts from PubMed, out of which 2,856 sentences were selected and manually annotated with cause and effect events. The data set was then divided into training (2,520), validation (252) and test (84) sentence sets. Features such as tokens, token categories, affixes, part of speech and shallow parser were used as inputs to the CRF model. A window of features before and after each token was used to determine its causal event label using CRF. A window of four features had the best performance, with 84.6% precision, 87% recall, and 85% F-measure. |
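The "window of features" input described above is easy to make concrete. A hypothetical sketch (not the authors' code; the `<PAD>` token and key naming are my own choices): collect the tokens in a symmetric window around each position, padding past sentence boundaries, which is the shape of per-token input a CRF tagger consumes.

```python
def window_features(tokens, i, size=4):
    """Token features in a +/-size window around position i, with
    out-of-bounds positions padded, as in window-based CRF input."""
    feats = {}
    for off in range(-size, size + 1):
        j = i + off
        feats[f"tok[{off:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    return feats

sentence = ["aspirin", "reduces", "stroke", "risk"]
print(window_features(sentence, 1, size=2))
```

Each token position would then be paired with its gold cause/effect label to form a CRF training instance.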
Mike
voted
brandon
voted
#2367
-
Mei 2009
An E-negotiation Model Based on Multi-agent and Ontology
2009 International Conference on Computational Intelligence and Natural Computing 2009;2():107-110 2009 DOI: 10.1109/CINC.2009.263 · Ref ID: 6379 In e-commerce environments, multi-agent systems are often used in automated negotiation. But when agents communicate, they do not necessarily use the same vocabulary or ontology. If they want to interact successfully, they must find correspondences between the terms used in their ontologies. It is obvious that negotiating agent architectures have not been addressed sufficiently. Towards this end, this paper presents a novel agent construction model that enables agents to communicate in the Semantic Web. The Semantic Web uses an ontology to describe the negotiation protocol, which enables an agent to gain the necessary knowledge of the protocol from the market. We demonstrate how the model allows us to accomplish these negotiation architectures. |
mohammed afaan
voted
yuexi
voted
#534
-
Meyer 2023
LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT
1st Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow (AIDRST) 2023;():103-115 Leipzig, GERMANY Springer Vieweg Verlag 2023 DOI: 10.1007/978-3-658-43705-3_8 · Ref ID: 3006 Knowledge Graphs (KG) provide us with a structured, flexible, transparent, cross-system, and collaborative way of organizing our knowledge and data across various domains in society and industrial as well as scientific disciplines. KGs surpass any other form of representation in terms of effectiveness. However, Knowledge Graph Engineering (KGE) requires in-depth experiences of graph structures, web technologies, existing models and vocabularies, rule sets, logic, as well as best practices. It also demands a significant amount of work. Considering the advancements in large language models (LLMs) and their interfaces and applications in recent years, we have conducted comprehensive experiments with ChatGPT to explore its potential in supporting KGE. In this paper, we present a selection of these experiments and their results to demonstrate how ChatGPT can assist us in the development and management of KGs. |
Ishan
voted
brandon
voted
#2891
-
Miguelañez 2011
Semantic Knowledge-Based Framework to Improve the Situation Awareness of Autonomous Underwater Vehicles
IEEE Transactions on Knowledge and Data Engineering 2011;23(5):759-773 2011 DOI: 10.1109/TKDE.2010.46 · Ref ID: 6097 This paper proposes a semantic world model framework for hierarchical distributed representation of knowledge in autonomous underwater systems. This framework aims to provide a more capable and holistic system, involving semantic interoperability among all involved information sources. This will enhance interoperability, independence of operation, and situation awareness of the embedded service-oriented agents for autonomous platforms. The results obtained specifically affect the mission flexibility, robustness, and autonomy. The presented framework makes use of the idea that heterogeneous real-world data of very different type must be processed by (and run through) several different layers, to be finally available in a suited format and at the right place to be accessible by high-level decision-making agents. In this sense, the presented approach shows how to abstract away from the raw real-world data step by step by means of semantic technologies. The paper concludes by demonstrating the benefits of the framework in a real scenario. A hardware fault is simulated in a REMUS 100 AUV while performing a mission. This triggers a knowledge exchange between the status monitoring agent and the adaptive mission planner embedded agent. By using the proposed framework, both services can interchange information while remaining domain independent during their interaction with the platform. The results of this paper are readily applicable to land and air robotics. |
mohammed afaan
voted
yuexi
voted
#3035
-
Milea 2012
tOWL: A Temporal Web Ontology Language
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 2012;42(1):268-281 2012 DOI: 10.1109/TSMCB.2011.2162582 · Ref ID: 6242 Through its interoperability and reasoning capabilities, the Semantic Web opens a realm of possibilities for developing intelligent systems on the Web. The Web Ontology Language (OWL) is the most expressive standard language for modeling ontologies, the cornerstone of the Semantic Web. However, up until now, no standard way of expressing time and time-dependent information in OWL has been provided. In this paper, we present a temporal extension of the very expressive fragment SHIN(D) of the OWL Description Logic language, resulting in the temporal OWL language. Through a layered approach, we introduce three extensions: 1) concrete domains, which allow the representation of restrictions using concrete domain binary predicates; 2) temporal representation, which introduces time points, relations between time points, intervals, and Allen's 13 interval relations into the language; and 3) timeslices/fluents, which implement a perdurantist view on individuals and allow for the representation of complex temporal aspects, such as process state transitions. We illustrate the expressiveness of the newly introduced language by using an example from the financial domain. |
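Allen's 13 interval relations referenced in this abstract are small enough to enumerate directly. As an illustration only (not tied to the tOWL formalism, and assuming proper intervals with start < end), here is one way to classify the relation between two time intervals:

```python
def allen_relation(a, b):
    """Return which of Allen's 13 interval relations holds between
    intervals a and b, each given as (start, end) with start < end."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"

print(allen_relation((1, 5), (3, 8)))  # → overlaps
```

The branch order matters: disjointness and meeting are tested first, so the final line only ever sees genuinely overlapping, non-nested intervals.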
mohammed afaan
voted
yuexi
voted
#420
-
Miller 2023
Knowledge Enhanced Deep Learning: Application to Pandemic Prediction
IEEE 9th International Conference on Collaboration and Internet Computing (CIC) 2023;():42-51 Atlanta, GA Ieee 2023 DOI: 10.1109/cic58953.2023.00016 · Ref ID: 3472 Deep Learning has been successfully applied to many problem domains, yet its advantages have been slow to emerge for time series forecasting. For example, in the well-known M Competitions, until recently, hybrids of traditional statistical or machine learning (e.g., gradient boosting) techniques were the top performers. With the recent architectural advances in deep learning being applied to time series forecasting, such as encoder-decoders with attention, transformers, representation learning, and graph neural networks, deep learning has begun to show its advantages. Still, in the area of pandemic prediction, there remain challenges for deep learning models: the time series is not long enough for effective training, ignorance of accumulated scientific knowledge, and interpretability of the model. Today, there is a vast amount of knowledge available that deep learning models can tap into, including Knowledge Graphs and Large Language Models fine-tuned with scientific domain knowledge. There is ongoing research examining how to utilize or inject knowledge into deep learning models. The state-of-the-art approaches are reviewed and suggestions for further work are provided. Recommendations for how this can be applied to future pandemics are given. |
mohammed afaan
voted
yuexi
voted
#230
-
Mimouni 2019
Entity Embedding Analogy for Implicit Link Discovery
16th International Extended Semantic Web Conference (ESWC) 2019;11762():126-129 Portoroz, SLOVENIA Springer International Publishing Ag 2019 DOI: 10.1007/978-3-030-32327-1_25 · Ref ID: 3245 In this work we are interested in the problem of knowledge graph (KG) incompleteness, which we propose to solve by discovering implicit triples using observed ones in the incomplete graph, leveraging analogy structures deduced from a KG embedding model. We use a language modelling approach that we adapt to entities and relations. The first results show that analogical inference in the projected vector space is relevant to a link prediction task. |
mohammed afaan
voted
yuexi
voted
#1713
-
Miranda-Escalada 2020
Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of CLEF eHealth 2020
CEUR Workshop Proceedings 2020;2696(): CEUR-WS 2020 Ref ID: 5729 Clinical coding requires the analysis and transformation of medical narratives into a structured or coded format using internationally recognized classification systems like ICD-10. These codes represent medical diagnoses and procedures. Clinical coding is critical for standardizing medical records, particularly for health information management systems used to carry out biomedical/epidemiological research studies, monitor health trends or facilitate medical billing and reimbursement. The growing amount of clinical records has prompted the search for tools that assist manual coding. Inspired by the CCMC challenge and various eHealth CLEF shared tasks, we organized the CodiEsp track. CodiEsp (eHealth CLEF 2020 Multilingual Information Extraction Shared Task) represents the first effort to promote the development and evaluation of automatic clinical coding systems for medical documents in Spanish. In this context, we have published a set of resources including (i) a manually coded Gold Standard corpus with inter-coder agreement and supporting textual evidence statements, (ii) an additional large collection of medical literature indexed with ICD-10 clinical codes and (iii) a machine translated corpus to enable multilingual approaches and testing of previous strategies developed for data in English. We have received a total of 168 runs submitted by 22 teams from 11 countries for at least one of our three sub-tracks: CodiEsp-D (Diagnosis Coding), CodiEsp-P (Procedure Coding) and CodiEsp-X (Explainable AI). Despite the considerable complexity of this task, which can be viewed as a hierarchical multi-label classification problem using ICD-10 codes as labels and documents as input, participants obtained very promising results, especially for codes that were well covered by the training data. Participants examined a variety of strategies, specifically deep learning approaches, pre-trained language models and word embeddings (BERT, BETO, FastText, etc.), as well as NER, string lookup and knowledge graph approaches. CodiEsp corpus: https://zenodo.org/record/3837305. Copyright © 2020 for this paper by its authors. |
mohammed afaan
voted
yuexi
voted
#3682
-
Mitra 2024
LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge
arXiv 2024;(): 2024 Ref ID: 8040 Security Operations Center (SOC) analysts gather threat reports from openly accessible global threat databases and customize them manually to suit a particular organization's needs. These analysts also depend on internal repositories, which act as a private local knowledge database for an organization. Credible cyber intelligence, critical operational details, and relevant organizational information are all stored in these local knowledge databases. Analysts undertake a labor-intensive task utilizing these global and local knowledge databases to manually create an organization's unique threat response and mitigation strategies. Recently, Large Language Models (LLMs) have shown the capability to efficiently process large diverse knowledge sources. We leverage this ability to process global and local knowledge databases to automate the generation of organization-specific threat intelligence. In this work, we present LOCALINTEL, a novel automated knowledge contextualization system that, upon prompting, retrieves threat reports from the global threat repositories and uses its local knowledge database to contextualize them for a specific organization. LOCALINTEL comprises three key phases: global threat intelligence retrieval, local knowledge retrieval, and contextualized completion generation. The former retrieves intelligence from global threat repositories, while the second retrieves pertinent knowledge from the local knowledge database. Finally, the fusion of these knowledge sources is orchestrated through a generator to produce a contextualized completion. |
mohammed afaan
voted
yuexi
voted
#3444
-
Mohammadjafari 2024
From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems
arXiv 2024;(): 2024 Ref ID: 8646 Since the onset of LLMs, translating natural language queries into structured SQL commands has assumed increasing importance. Unlike previous reviews, this survey provides a comprehensive study of the evolution of LLM-based text-to-SQL systems, from early rule-based models to advanced LLM approaches, and how LLMs have impacted this field. We discuss benchmarks, evaluation methods and evaluation metrics. Also, we uniquely study the role of integrating knowledge graphs for better contextual accuracy and schema linking in these systems. The current techniques fall into two categories: in-context learning of corpus and fine-tuning, which then leads to approaches such as zero-shot and few-shot learning, and data augmentation. Finally, we highlight key challenges such as computational efficiency, model robustness, and data privacy, with perspectives toward their development and improvements in potential areas for the future of LLM-based text-to-SQL systems. |
mohammed afaan
voted
Ishan
voted
#1080
-
Mohsenimofidi 2024
Classifying User Intent for Effective Prompt Engineering: A Case of a Chatbot for Startup Teams
Generative AI for Effective Softw. Development 2024;():317-329 Springer Nature 2024 DOI: 10.1007/978-3-031-55642-5_15 · Ref ID: 4232 Prompt engineering plays a pivotal role in effective interaction with large language models (LLMs), including ChatGPT. Understanding user intent behind interactions with LLMs is an important part of prompt construction to elicit relevant and meaningful responses from them. Existing literature sheds little light on this aspect of prompt engineering. Our study seeks to address this knowledge gap. Using the example of building a chatbot for startup teams to obtain better responses from ChatGPT, we demonstrate a feasible way of classifying user intent automatically using ChatGPT itself. Our study contributes to a rapidly increasing body of knowledge of prompt engineering for LLMs. Even though the application domain of our approach is startups, it can be adapted to support effective prompt engineering in various other application domains as well. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
Srividya
voted
Mike
voted
#776
-
Moiseev 2022
SKILL: Structured Knowledge Infusion for Large Language Models
Conference of the North-American-Chapter-of-the-Association-for-Computational-Linguistics (NAACL) - Human Language Technologies 2022;():1581-1588 Seattle, WA Assoc Computational Linguistics-Acl 2022 Ref ID: 3168 Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks. However, it is largely unexplored whether they can better internalize knowledge from structured data, such as a knowledge graph, or from text. In this work, we propose a method to infuse structured knowledge into LLMs, by directly training T5 models on factual triples of knowledge graphs (KGs). We show that models pre-trained on the Wikidata KG with our method outperform the T5 baselines on FreebaseQA and WikiHop, as well as the Wikidata-answerable subset of TriviaQA and NaturalQuestions. The models pretrained on factual triples compare competitively with the ones on natural language sentences that contain the same knowledge. Trained on a smaller KG, WikiMovies, we saw a 3x improvement of exact match score on the MetaQA task compared to the T5 baseline. The proposed method has the advantage that no alignment between the knowledge graph and text corpus is required in curating training data. This makes our method particularly useful when working with industry-scale knowledge graphs. |
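The core data transformation this abstract describes, feeding KG triples to a language model as plain text, can be sketched in a few lines. This is an illustrative toy, not the paper's T5 pipeline: the underscore-to-space rule and the sentence template are my own simplifications.

```python
def verbalize_triples(triples):
    """Render (subject, relation, object) KG triples as plain training
    strings, e.g. for continued pre-training of a sequence model."""
    return [f"{s} {r.replace('_', ' ')} {o}." for s, r, o in triples]

kg = [("Paris", "capital_of", "France"), ("France", "member_of", "EU")]
print(verbalize_triples(kg))
```

Because the triples themselves are the training text, no alignment with a separate corpus is needed, which is the property the abstract highlights.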
Mike
voted
Srividya
voted
#1691
-
Moses 2024
NLPeople at TextGraphs-17 Shared Task: Chain of Thought Questioning to Elicit Decompositional Reasoning
TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():142-148 Association for Computational Linguistics (ACL) 2024 Ref ID: 4247 This paper presents the approach of the NLPeople team for the Text-Graph Representations for KGQA Shared Task at TextGraphs-17 (Sakhovskiy et al., 2024). The task involved selecting an answer for a given question from a list of candidate entities. We show that prompting Large Language Models (LLMs) to break down a natural language question into a series of sub-questions allows models to understand complex questions. The LLMs arrive at the final answer by answering the intermediate questions using their internal knowledge, without needing additional context. Our approach to the task uses an ensemble of prompting strategies to guide how LLMs interpret various types of questions. Our submission achieves an F1 score of 85.90, ranking 1st among the other participants in the task. © 2024 Association for Computational Linguistics. |
Kwesi
voted
yuexi
voted
#1130
-
Mousavi 2024
Construction of Paired Knowledge Graph - Text Datasets Informed by Cyclic Evaluation
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():3782-3803 European Language Resources Association (ELRA) 2024 Ref ID: 4535 Datasets that pair Knowledge Graphs (KG) and text together (KG-T) can be used to train forward and reverse neural models that generate text from KG and vice versa. However, models trained on datasets where KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise and find that noisier datasets do indeed lead to more hallucination. We argue that the ability of forward and reverse models trained on a dataset to cyclically regenerate the source KG or text is a proxy for the equivalence between the KG and the text in the dataset. Using cyclic evaluation, we find that the manually created WebNLG is much better than the automatically created TeKGen and T-REx. Informed by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text, and show the impact of each of the heuristics on cyclic evaluation. We also construct two synthetic datasets using large language models (LLMs), and observe that these are conducive to models that perform significantly well on cyclic generation of text, but less so on cyclic generation of KGs, probably because of a lack of a consistent underlying ontology. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Davis
voted
Srividya
voted
#3354
-
Mousavi 2024
DyKnow: Dynamically Verifying Time-Sensitive Factual Knowledge in LLMs
arXiv 2024;(): 2024 Ref ID: 8232 LLMs acquire knowledge from massive data snapshots collected at different timestamps. Their knowledge is then commonly evaluated using static benchmarks. However, factual knowledge is generally subject to time-sensitive changes, and static benchmarks cannot address those cases. We present an approach to dynamically evaluate the knowledge in LLMs and their time-sensitiveness against Wikidata, a publicly available up-to-date knowledge graph. We evaluate the time-sensitive knowledge in twenty-four private and open-source LLMs, as well as the effectiveness of four editing methods in updating the outdated facts. Our results show that 1) outdatedness is a critical problem across state-of-the-art LLMs; 2) LLMs output inconsistent answers when prompted with slight variations of the question prompt; and 3) the performance of the state-of-the-art knowledge editing algorithms is very limited, as they can not reduce the cases of outdatedness and output inconsistency. |
yuexi
voted
Mike
voted
#3833
-
Mruthyunjaya 2023
Rethinking Language Models as Symbolic Knowledge Graphs
arXiv 2023;(): 2023 Ref ID: 7817 Symbolic knowledge graphs (KGs) play a pivotal role in knowledge-centric applications such as search, question answering and recommendation. As contemporary language models (LMs) trained on extensive textual data have gained prominence, researchers have extensively explored whether the parametric knowledge within these models can match up to that present in knowledge graphs. Various methodologies have indicated that enhancing the size of the model or the volume of training data enhances its capacity to retrieve symbolic knowledge, often with minimal or no human supervision. Despite these advancements, there is a void in comprehensively evaluating whether LMs can encompass the intricate topological and semantic attributes of KGs, attributes crucial for reasoning processes. In this work, we provide an exhaustive evaluation of language models of varying sizes and capabilities. We construct nine qualitative benchmarks that encompass a spectrum of attributes including symmetry, asymmetry, hierarchy, bidirectionality, compositionality, paths, entity-centricity, bias and ambiguity. Additionally, we propose novel evaluation metrics tailored for each of these attributes. Our extensive evaluation of various LMs shows that while these models exhibit considerable potential in recalling factual information, their ability to capture intricate topological and semantic traits of KGs remains significantly constrained. We note that our proposed evaluation metrics are more reliable in evaluating these abilities than the existing metrics. Lastly, some of our benchmarks challenge the common notion that larger LMs (e.g., GPT-4) universally outshine their smaller counterparts (e.g., BERT). |
Mike
voted
Srividya
voted
#220
-
Mu 2024
Enhancing Narrative Commonsense Reasoning With Multilevel Causal Knowledge
IEEE Trans. Neural Netw. Learn. Syst. 2024;():13 2024 DOI: 10.1109/tnnls.2024.3380851 · Ref ID: 3430 A narrative is an account of the unfolding of events, along with explanations of how and why these processes and events came to be. To understand narratives, causality has been proven to be especially useful. Causality manifests itself primarily at both the event and sentence levels, offering essential insights into understanding narratives. However, previous works utilize either sentence-level or event-level causalities. In this article, we devise a two-stage approach to fully exploit both levels of causal relationships. In the first stage, by devising posttraining tasks, we inject sentence-level causalities into pretrained language models (PLMs). The causal-enhanced PLMs, which carry sentence-level causalities, can be transferred to downstream tasks. In the second stage, we utilize event causalities to further refine narrative commonsense reasoning. But the event sparsity problem brings about the difficulty of learning event representations and capturing useful causal semantics. To alleviate this problem, we break down events into multiple word components, enabling the retrieval of word-word relations between these components. It is possible to alleviate the event sparsity problem since word-word relations capture the interplays between event components. Based on the event-level causalities and the word-level relations, we construct a hierarchical knowledge graph (KG) as the knowledge ground. A KG-based reasoning process is then employed for narrative commonsense reasoning. Experimental results affirm the effectiveness of our framework. |
yuexi
voted
mohammed afaan
voted
#827
-
Mühlenberg 2019
Towards information extraction from ISR reports for decision support using a two-stage learning-based approach
24th Conference on Open Architecture/Open Business Model Net-Centric Systems and Defense Transformation 2019;11015(): Baltimore, MD Spie-Int Soc Optical Engineering 2019 DOI: 10.1117/12.2518599 · Ref ID: 3736 The main challenge of computer linguistics is to represent the meaning of text in a computer model. Statistics based methods with manually created features have been used for more than 30 years with a divide and conquer approach to mark interesting features in free text. Around 2010, deep learning concepts found their way into the text-understanding research community. Deep learning is very attractive and easy to apply but needs massive pools of annotated and high quality data from every target domain, which is generally not available especially for the military domain. When changing the application domain one needs additional or new data to adopt the language models to the new domain. To overcome the everlasting "data problem" we chose a novel two-step approach by first using formal representations of the meaning and then applying a rule-based mapping to the target domain. As an intermediate language representation, we used abstract meaning representation (AMR) and trained a general base model. This base model was then trained with additional data from the intended domains (transfer learning) evaluating the quality of the parser with a stepwise approach in which we measured the parser performance against the amount of training data. This approach answered the question of how much data we need to get the required quality when changing an application domain. The mapping of the meaning representation to the target domain model gave us more control over specifics of the domain, which are not generally representable by a machine learning approach with self-learned feature vectors. |
mohammed afaan
voted
yuexi
voted
#728
-
Muludi 2024
Retrieval-Augmented Generation Approach: Document Question Answering using Large Language Model
Int. J. Adv. Comput. Sci. Appl. 2024;15(3):776-785 2024 Ref ID: 3682 This study introduces the Retrieval Augmented Generation (RAG) method to improve Question-Answering (QA) systems by addressing document processing in Natural Language Processing problems. It represents the latest breakthrough in applying RAG to document question and answer applications, overcoming previous QA system obstacles. RAG combines search techniques over a vector store and the text generation mechanism developed by Large Language Models, offering a time-efficient alternative to manual reading limitations. The research evaluates RAG systems that use Generative Pre-trained Transformer 3.5 (GPT-3.5-turbo, from the ChatGPT model family) and their impact on document data processing, comparing them with other applications. This research also provides datasets to test the capabilities of the QA document system. The proposed dataset and the Stanford Question Answering Dataset (SQuAD) are used for performance testing. The study contributes theoretically by advancing methodologies and knowledge representation, supporting benchmarking in research communities. Results highlight RAG's superiority: achieving a precision of 0.74 in Recall-Oriented Understudy for Gisting Evaluation (ROUGE) testing, outperforming others at 0.5; obtaining an F1 score of 0.88 in BERTScore, surpassing other QA apps at 0.81; attaining a precision of 0.28 in Bilingual Evaluation Understudy (BLEU) testing, surpassing others with a precision of 0.09; and scoring 0.33 in Jaccard Similarity, outshining others at 0.04. These findings underscore RAG's efficiency and competitiveness, promising a positive impact on various industrial sectors through advanced Artificial Intelligence (AI) technology. |
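Of the metrics this abstract reports, Jaccard similarity is the simplest to reproduce. A minimal token-set version (an illustrative sketch; the study's exact tokenization and casing rules are not stated, so lowercase whitespace tokens are an assumption):

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the whitespace token sets of two strings:
    |intersection| / |union|, with 0.0 for two empty strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0

print(jaccard("the cat sat", "the cat ran"))  # 2 shared / 4 total tokens
```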
mohammed afaan
voted
yuexi
voted
#386
-
Mzwri 2023
Internet Wizard for Enhancing Open-Domain Question-Answering Chatbot Knowledge Base in Education
Chatbots have gained widespread popularity for their task automation capabilities and consistent availability in various domains, including education. However, their ability to adapt to the continuously evolving and dynamic nature of knowledge is limited. This research investigates the implementation of an internet wizard to enhance the knowledge base of an open-domain question-answering chatbot. The proposed approach leverages search engines, particularly Google, and its features, including feature snippets, knowledge graph, and organic search, in conjunction with data science and natural language models. This mechanism empowers the chatbot to dynamically access the extensive and up-to-date knowledge available on the web, enabling the provision of real time and pertinent answers to user queries sourced from web documents. A pilot study in a higher education context evaluated the chatbot's mechanism and features, confirming its proficiency in generating responses across a broad range of educational and non-educational topics. Positive feedback and high user satisfaction validate these findings. Notably, the chatbot's dynamic feature of retrieving related or follow-up questions from search engines significantly enhances student engagement and facilitates exploration of supplementary information beyond the curriculum. |
mohammed afaan
voted
yuexi
voted
#1741
-
Na 2023
A Pre-training Method Inspired by Large Language Model for Power Named Entity Recognition
ACM International Conference Proceeding Series 2023;():308-312 Association for Computing Machinery 2023 DOI: 10.1145/3653081.3653131 · Ref ID: 4712 In recent years, the field of natural language processing has witnessed remarkable advancements due to the success of large language models. These models leverage the Transformer architecture and pre-training techniques to achieve impressive results. In this paper, we draw inspiration from large language models and apply these techniques into the task of named entity recognition in the domain of power grids, which is critical for building power grid knowledge graphs and question-answering systems. Specifically, we propose a BERT-CNN-BIGRU-CRF deep learning model for named entity recognition. This model effectively harnesses the semantic modeling capabilities and pre-training knowledge of BERT, which is based on the Transformer architecture. By incorporating CNN and BIGRU, the model captures and models both local and global features, respectively. The CRF layer is employed for label classification. This combination of components ensures a high level of recognition accuracy. To evaluate the performance of the proposed model, we train our model on annotated maintenance plan data. We compare its results with those of other commonly used models. The evaluation metrics include recall, precision, and F1 score, which are widely employed in named entity recognition tasks. Our proposed model achieves optimal performance across all three metrics, demonstrating its superiority over other models. © 2023 ACM. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3860
-
Nadkarni 2021
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study
arXiv 2021;(): 2021 Ref ID: 7466 Biomedical knowledge graphs (KGs) hold rich information on entities such as diseases, drugs, and genes. Predicting missing links in these graphs can boost many important applications, such as drug design and repurposing. Recent work has shown that general-domain language models (LMs) can serve as "soft" KGs, and that they can be fine-tuned for the task of KG completion. In this work, we study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We evaluate several domain-specific LMs, fine-tuning them on datasets centered on drugs and diseases that we represent as KGs and enrich with textual entity descriptions. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance. Finally, we demonstrate the advantage of LM models in the inductive setting with novel scientific entities. Our datasets and code are made publicly available. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1248
-
Nahed 2024
Enhancing Clinical Trial Summarization: Leveraging Large Language Models and Knowledge Graphs for Entity Preservation
Lecture Notes in Networks and Systems 2024;1003 LNNS():325-336 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-981-97-3302-6_26 · Ref ID: 4434 ClinicalTrials.gov is an accessible online medical resource for researchers, healthcare professionals, and policy designers seeking detailed information on clinical trials. Summarizing these long clinical records can significantly reduce the time needed for the database users as the process transforms comprehensive information into concise synopses, preserving the essential meaning and facilitating understanding. In this paper, we employ the Bidirectional and Auto-Regressive Transformers model to generate the trials’ brief summaries. Our contributions provide new preprocessing techniques for model training, which leads to a robust summarization model. The fine-tuned model significantly enhanced ROUGE-1, ROUGE-2, and ROUGE-L F1-scores by 14%, 23%, and 20%, respectively, compared to previous studies. Additionally, we present an innovative knowledge graph based on entity classes to assess the generated summaries. This graph not only quantifies the essential entities transformed from the original text to the summaries but also provides insights into their specific order and arrangement in sentences. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#2350
-
Najjar 2005
DOKGETT - an authoring tool for cognitive model-based generation of the knowledge
Fifth IEEE International Conference on Advanced Learning Technologies (ICALT'05) 2005;():371-375 2005 DOI: 10.1109/ICALT.2005.127 · Ref ID: 6209 In this paper we present an authoring tool milieu that permits graphically modelling any subject-matter domain knowledge and automatically transposing it into related XML files. Generated contents serve as a tutor reasoning support when interacting with students engaged in learning activities through virtual learning environments.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#365
-
Naseem 2022
Incorporating Medical Knowledge to Transformer-based Language Models for Medical Dialogue Generation
21st Workshop on Biomedical Language Processing (BioNLP) at the 60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():110-115 Dublin, IRELAND Assoc Computational Linguistics-Acl 2022 Ref ID: 3184 Medical dialogue systems have the potential to assist doctors in expanding access to medical care, improving the quality of patient experiences, and lowering medical expenses. The computational methods are still in their early stages and are not ready for widespread application despite their great potential. Existing transformer-based language models have shown promising results but lack domain-specific knowledge. However, to diagnose like doctors, an automatic medical diagnosis necessitates more stringent requirements for the rationality of the dialogue in the context of relevant knowledge. In this study, we propose a new method that addresses the challenges of medical dialogue generation by incorporating medical knowledge into transformer-based language models. We present a method that leverages an external medical knowledge graph and injects triples as domain knowledge into the utterances. Automatic and human evaluation on a publicly available dataset demonstrates that incorporating medical knowledge outperforms several state-of-the-art baseline methods. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1107
-
Naveen 2024
Comparative Methods of Implementation for Different Question Answering Systems
Proceedings - 2024 5th International Conference on Intelligent Communication Technologies and Virtual Mobile Networks, ICICV 2024 2024;():567-575 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICICV62344.2024.00096 · Ref ID: 4636 This research introduces an innovative Question Answering (QA) system tailored explicitly for government department inquiries regarding individuals. Harnessing the prowess of cutting-edge language models such as BERT and T5 (Text-to-Text Transfer Transformer), the system excels in understanding complex queries within diverse governmental domains. Moreover, it incorporates a specialized Knowledge Graph meticulously curated with interconnected information about people across various departments. By integrating BERT and T5 for versatile query comprehension and answer generation alongside a comprehensive People-centric Knowledge Graph, this system aims to revolutionize information retrieval within government entities. The seamless fusion of these technologies promises accurate, contextually rich responses, optimizing operational efficiency across government departments and fostering streamlined access to crucial information. © 2024 IEEE. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#378
-
Nayyeri 2023
Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces
22nd International Semantic Web Conference (ISWC) 2023;14265():388-407 Athens, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-47240-4_21 · Ref ID: 2930 Knowledge graphs comprise structural and textual information to represent knowledge. To predict new structural knowledge, current approaches learn representations using both types of information through knowledge graph embeddings and language models. These approaches commit to a single pre-trained language model. We hypothesize that heterogeneous language models may provide complementary information not exploited by current approaches. To investigate this hypothesis, we propose a unified framework that integrates multiple representations of structural knowledge and textual information. Our approach leverages hypercomplex algebra to model the interactions between (i) graph structural information and (ii) multiple text representations. Specifically, we utilize Dihedron models with 4*D dimensional hypercomplex numbers to integrate four different representations: structural knowledge graph embeddings, word-level representations (e.g., Word2vec and FastText), sentence-level representations (using a sentence transformer), and document-level representations (using FastText or Doc2vec). Our unified framework scores the plausibility of labeled edges via Dihedron products, thus modeling pairwise interactions between the four representations. Extensive experimental evaluations on standard benchmark datasets confirm our hypothesis, showing the superiority of our two new frameworks for link prediction tasks.
Kwesi
voted
Davis
voted
Final decision
What was the agreed final decision?
#3138
-
Nayyeri 2023
Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces
The Semantic Web – ISWC 2023: 22nd International Semantic Web Conference, Athens, Greece, November 6–10, 2023, Proceedings, Part I 2023;():388–407 Athens, Greece Springer-Verlag 2023 DOI: 10.1007/978-3-031-47240-4_21 · Ref ID: 7152 |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#7
-
Nguyen 2021
Advanced Semantics for Commonsense Knowledge Extraction
30th World Wide Web Conference (WWW) 2021;():2636-2647 Electr Network Assoc Computing Machinery 2021 DOI: 10.1145/3442381.3449827 · Ref ID: 3760 Commonsense knowledge (CSK) about concepts and their properties is useful for AI applications such as robust chatbots. Prior works like ConceptNet, TupleKB and others compiled large CSK collections, but are restricted in their expressiveness to subject-predicate-object (SPO) triples with simple concepts for S and monolithic strings for P and O. Also, these projects have either prioritized precision or recall, but hardly reconcile these complementary goals. This paper presents a methodology, called Ascent, to automatically build a large-scale knowledge base (KB) of CSK assertions, with advanced expressiveness and both better precision and recall than prior works. Ascent goes beyond triples by capturing composite concepts with subgroups and aspects, and by refining assertions with semantic facets. The latter are important to express temporal and spatial validity of assertions and further qualifiers. Ascent combines open information extraction with judicious cleaning using language models. Intrinsic evaluation shows the superior size and quality of the Ascent KB, and an extrinsic evaluation for QA-support tasks underlines the benefits of Ascent. A web interface, data and code can be found at https://www.mpi-inf.mpg.de/ascent. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2032
-
Ni 2024
When Do LLMs Need Retrieval Augmentation? Mitigating LLMs' Overconfidence Helps Retrieval Augmentation
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():11375-11388 Association for Computational Linguistics (ACL) 2024 Ref ID: 4253 Large Language Models (LLMs) have been found to have difficulty knowing they do not possess certain knowledge and tend to provide specious answers in such cases. Retrieval Augmentation (RA) has been extensively studied to mitigate LLMs' hallucinations. However, due to the extra overhead and unassured quality of retrieval, it may not be optimal to conduct RA all the time. A straightforward idea is to only conduct retrieval when LLMs are uncertain about a question. This motivates us to enhance the LLMs' ability to perceive their knowledge boundaries to help RA. In this paper, we first quantitatively measure LLMs' such ability and confirm their overconfidence. Then, we study how LLMs' certainty about a question correlates with their dependence on external retrieved information. We propose several methods to enhance LLMs' perception of knowledge boundaries and show that they are effective in reducing overconfidence. Additionally, equipped with these methods, LLMs can achieve comparable or even better performance of RA with much fewer retrieval calls. The code can be found at https://github.com/ShiyuNee/When-to-Retrieve. © 2024 Association for Computational Linguistics. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1087
-
Nighojkar 2022
Cognitive Modeling of Semantic Fluency Using Transformers
CEUR Workshop Proceedings 2022;3251(): CEUR-WS 2022 Ref ID: 5472 Can deep language models be explanatory models of human cognition? If so, what are their limits? In order to explore this question, we propose an approach called hyperparameter hypothesization that uses predictive hyperparameter tuning in order to find individuating descriptors of cognitive-behavioral profiles. We take the first step in this approach by predicting human performance in the semantic fluency task (SFT), a well-studied task in cognitive science that has never before been modeled using transformer-based language models (TLMs). In our task setup, we compare several approaches to predicting which word an individual performing SFT will utter next. We report preliminary evidence suggesting that, despite obvious implementational differences in how people and TLMs learn and use language, TLMs can be used to identify individual differences in human fluency task behaviors better than existing computational models, and may offer insights into human memory retrieval strategies, a cognitive process not typically considered to be the kind of thing TLMs can model. Finally, we discuss the implications of this work for cognitive modeling of knowledge representations. © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3962
-
Ning 2024
UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction
arXiv 2024;(): 2024 Ref ID: 8089 Urban knowledge graph has recently worked as an emerging building block to distill critical knowledge from multi-sourced urban data for diverse urban application scenarios. Despite its promising benefits, urban knowledge graph construction (UrbanKGC) still heavily relies on manual effort, hindering its potential advancement. This paper presents UrbanKGent, a unified large language model agent framework, for urban knowledge graph construction. Specifically, we first construct the knowledgeable instruction set for UrbanKGC tasks (such as relational triplet extraction and knowledge graph completion) via heterogeneity-aware and geospatial-infused instruction generation. Moreover, we propose a tool-augmented iterative trajectory refinement module to enhance and refine the trajectories distilled from GPT-4. Through hybrid instruction fine-tuning with augmented trajectories on Llama 2 and Llama 3 family, we obtain UrbanKGC agent family, consisting of UrbanKGent-7/8/13B version. We perform a comprehensive evaluation on two real-world datasets using both human and GPT-4 self-evaluation. The experimental results demonstrate that UrbanKGent family can not only significantly outperform 31 baselines in UrbanKGC tasks, but also surpass the state-of-the-art LLM, GPT-4, by more than 10% with approximately 20 times lower cost. Compared with the existing benchmark, the UrbanKGent family could help construct an UrbanKG with hundreds of times richer relationships using only one-fifth of the data. Our data and code are available at https://github.com/usail-hkust/UrbanKGent. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#2028
-
Niu 2024
What Does the Knowledge Neuron Thesis Have to Do with Knowledge?
12th International Conference on Learning Representations, ICLR 2024 2024;(): International Conference on Learning Representations, ICLR 2024 Ref ID: 4630 We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism underlying the ability of large language models to recall facts from a training corpus. This nascent thesis proposes that facts are recalled from the training corpus through the MLP weights in a manner resembling key-value memory, implying in effect that “knowledge” is stored in the network. Furthermore, by modifying the MLP modules, one can control the language model's generation of factual information. The plausibility of the KN thesis has been demonstrated by the success of KN-inspired model editing methods (Dai et al., 2022; Meng et al., 2022). We find that this thesis is, at best, an oversimplification. Not only have we found that we can edit the expression of certain linguistic phenomena using the same model editing methods but, through a more comprehensive evaluation, we have found that the KN thesis does not adequately explain the process of factual expression. While it is possible to argue that the MLP weights store complex patterns that are interpretable both syntactically and semantically, these patterns do not constitute “knowledge.” To gain a more comprehensive understanding of the knowledge representation process, we must look beyond the MLP weights and explore recent models' complex layer structures and attention mechanisms. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3714
-
Niu 2024
Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval
arXiv 2024;(): 2024 Ref ID: 8288 Large language models (LLMs) have demonstrated remarkable capabilities across various domains, although their susceptibility to hallucination poses significant challenges for their deployment in critical areas such as healthcare. To address this issue, retrieving relevant facts from knowledge graphs (KGs) is considered a promising method. Existing KG-augmented approaches tend to be resource-intensive, requiring multiple rounds of retrieval and verification for each factoid, which impedes their application in real-world scenarios. In this study, we propose Self-Refinement-Enhanced Knowledge Graph Retrieval (Re-KGR) to augment the factuality of LLMs' responses with less retrieval efforts in the medical field. Our approach leverages the attribution of next-token predictive probability distributions across different tokens, and various model layers to primarily identify tokens with a high potential for hallucination, reducing verification rounds by refining knowledge triples associated with these tokens. Moreover, we rectify inaccurate content using retrieved knowledge in the post-processing stage, which improves the truthfulness of generated responses. Experimental results on a medical dataset demonstrate that our approach can enhance the factual capability of LLMs across various foundational models as evidenced by the highest scores on truthfulness. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3510
-
Nori 2023
Identification of Knowledge Neurons in Protein Language Models
arXiv 2023;(): 2023 Ref ID: 7994 Neural language models have become powerful tools for learning complex representations of entities in natural language processing tasks. However, their interpretability remains a significant challenge, particularly in domains like computational biology where trust in model predictions is crucial. In this work, we aim to enhance the interpretability of protein language models, specifically the state-of-the-art ESM model, by identifying and characterizing knowledge neurons - components that express understanding of key information. After fine-tuning the ESM model for the task of enzyme sequence classification, we compare two knowledge neuron selection methods that preserve a subset of neurons from the original model. The two methods, activation-based and integrated gradient-based selection, consistently outperform a random baseline. In particular, these methods show that there is a high density of knowledge neurons in the key vector prediction networks of self-attention modules. Given that key vectors specialize in understanding different features of input sequences, these knowledge neurons could capture knowledge of different enzyme sequence motifs. In the future, the types of knowledge captured by each neuron could be characterized. |
Davis
voted
yuexi
voted
Final decision
What was the agreed final decision?
#438
-
Oduro-Afriyie 2023
Knowledge Graph Enabled Open-Domain Conversational Question Answering
15th International Conference on Flexible Query Answering Systems (FQAS) 2023;14113():63-76 European Soc Fuzzy Log & Technol, Mallorca, SPAIN Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-42935-4_6 · Ref ID: 3056 With the advent of natural language enabled applications, there has been a growing appetite for conversational question answering systems. This demand is being largely satisfied with the help of such powerful language models as OpenAI's GPT models, Google's BERT, and BigScience's BLOOM. However, the astounding amount of training data and computing resources required to create such models is a huge challenge. Furthermore, for such systems, catering to multiple application domains typically requires the acquisition of even more training data. We discuss an alternative approach to the problem of open-domain conversational question answering by utilizing knowledge graphs to capture relevant information from a body of text in any domain. We achieve this by allowing the relations of the knowledge graphs to be drawn directly from the body of text being processed, rather than from a fixed ontology. By connecting this process with SPARQL queries generated from natural language questions, we demonstrate the foundations of an open-domain question answering system that requires no training and can switch domains flexibly and seamlessly.
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#208
-
Oduro-Afriyie 2023
Enabling the Informed Patient Paradigm with Secure and Personalized Medical Question Answering
14th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB) 2023;(): Houston, TX Assoc Computing Machinery 2023 DOI: 10.1145/3584371.3613016 · Ref ID: 3018 Quality patient care is a complex and multifaceted problem requiring the integration of data from multiple sources. We propose Medicient, a knowledge-graph-based question answering system that processes heterogeneous data sources, including patient health records, drug databases, and medical literature, into a unified knowledge graph with zero training. The knowledge graph is then utilized to provide personalized recommendations for treatment or medication. The system leverages the power of large language models for question understanding and natural language response generation, while hiding sensitive patient information. We compare our system to a large language model (ChatGPT), which does not have access to patient health records, and show that our system provides better recommendations. This study contributes to a growing body of research on knowledge graphs and their applications in healthcare. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1710
-
Oelen 2024
ORKG ASK: a Neuro-symbolic Scholarly Search and Exploration System
CEUR Workshop Proceedings 2024;3759(): CEUR-WS 2024 Ref ID: 4226 Purpose: Finding scholarly articles is a time-consuming and cumbersome activity, yet crucial for conducting science. Due to the growing number of scholarly articles, new scholarly search systems are needed to effectively assist researchers in finding relevant literature. Methodology: We take a neuro-symbolic approach to scholarly search and exploration by leveraging state-of-the-art components, including semantic search, Large Language Models (LLMs), and Knowledge Graphs (KGs). The semantic search component composes a set of relevant articles. From this set of articles, information is extracted and presented to the user. Findings: The presented system, called ORKG ASK (Assistant for Scientific Knowledge), provides a production-ready search and exploration system. Our preliminary evaluation indicates that our proposed approach is indeed suitable for the task of scholarly information retrieval. Value: With ORKG ASK, we present a next-generation scholarly search and exploration system and make it available online. Additionally, the system components are open source with a permissive license. © 2024 Copyright for this paper by its authors.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1631
-
Omar 2023
Measurement of ChatGPT Performance in Mapping Natural Language Specification into an Entity Relationship Diagram
2023 IEEE 11th International Conference on Systems and Control, ICSC 2023 2023;():530-535 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICSC58660.2023.10449869 · Ref ID: 4961 This paper explores the entity relationship diagram, a popular conceptual model used to depict entities, attributes, and relationships graphically. To help with this, we use ChatGPT, a sophisticated language model based on the GPT architecture, which can translate natural language text into an entity relationship diagram. The paper details the process of evaluating how well ChatGPT can perform compared to other state-of-the-art approaches for entity and relationship extraction. Our experimental findings demonstrate the strong ability of ChatGPT to translate natural language text into entity relationship diagrams, which has potential applications for knowledge graph building, data integration, and database schema design. Moreover, it can aid in automating the extraction and organization of information from unstructured text data, thereby simplifying the study of complex systems. © 2023 IEEE. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1511
-
Omar 2021
A knowledge graph question-answering platform trained independently of the graph
CEUR Workshop Proceedings 2021;2980(): CEUR-WS 2021 Ref ID: 5681 We will demonstrate KGQAn, a question-answering platform trained independently of KGs. KGQAn transforms a question into semantically equivalent SPARQL queries via a novel three-phase strategy based on natural language models trained generally for understanding and leveraging short English text. Without preprocessing or annotated questions on KGs, KGQAn outperformed the existing systems in KG question answering by an improvement of at least 33% in F1-measure and 61% in precision. During the demo, the audience will experience KGQAn for question answering on real KGs of topics of interest to them, such as DBpedia and OpenCitations Graph, and review the generated SPARQL queries and answers. A demo video is available online. © 2021 CEUR-WS. All rights reserved.
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#541
-
Omeliyanenko 2020
LM4KG: Improving Common Sense Knowledge Graphs with Language Models
19th International Semantic Web Conference (ISWC) 2020;12506():456-473 Athens, GREECE Springer International Publishing Ag 2020 DOI: 10.1007/978-3-030-62419-4_26 · Ref ID: 3117 Language Models (LMs) and Knowledge Graphs (KGs) are both active research areas in Machine Learning and the Semantic Web. While LMs have brought great improvements for many downstream tasks on their own, they are often combined with KGs providing additionally aggregated, well structured knowledge. Usually, this is done by leveraging KGs to improve LMs. But what happens if we turn this around and use LMs to improve KGs? In this paper, we propose a method enabling the use of the knowledge inherently encoded in LMs to automatically improve explicit knowledge represented in common sense KGs. Edges in these KGs represent relations between concepts, but the strength of the relations is often not clear. We propose to transform KG relations to natural language sentences, allowing us to utilize the information contained in large LMs to rate these sentences through a new perplexity-based measure, Refined Edge WEIGHTing (REWEIGHT). We test our scoring scheme REWEIGHT on the popular LM BERT to produce new weights for the edges in the well-known ConceptNet KG. By retrofitting existing word embeddings to our modified ConceptNet, we create ConceptNet NumBERTbatch embeddings and show that these outperform the original ConceptNet Numberbatch on multiple established semantic similarity datasets.
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#87
-
Omeliyanenko 2023
CapsKG: Enabling Continual Knowledge Integration in Language Models for Automatic Knowledge Graph Completion
22nd International Semantic Web Conference (ISWC) 2023;14265():618-636 Athens, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-47240-4_33 · Ref ID: 2932 Automated completion of knowledge graphs is a popular topic in the Semantic Web community that aims to automatically and continuously integrate new appearing knowledge into knowledge graphs using artificial intelligence. Recently, approaches that leverage implicit knowledge from language models for this task have shown promising results. However, by fine-tuning language models directly to the domain of knowledge graphs, models forget their original language representation and associated knowledge. An existing solution to address this issue is a trainable adapter, which is integrated into a frozen language model to extract the relevant knowledge without altering the model itself. However, this constrains the generalizability to the specific extraction task and by design requires new and independent adapters to be trained for new knowledge extraction tasks. This effectively prevents the model from benefiting from existing knowledge incorporated in previously trained adapters. In this paper, we propose to combine the benefits of adapters for knowledge graph completion with the idea of integrating capsules, introduced in the field of continual learning. This allows the continuous integration of knowledge into a joint model by sharing and reusing previously trained capsules. We find that our approach outperforms solutions using traditional adapters, while requiring notably fewer parameters for continuous knowledge integration. Moreover, we show that this architecture benefits significantly from knowledge sharing in low-resource situations, outperforming adapter-based models on the task of link prediction. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3209
-
Oruganti 2023
Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs
arXiv 2023;(): 2023 Ref ID: 8014 The paper describes a system that uses large language model (LLM) technology to support the automatic learning of new entries in an intelligent agent's semantic lexicon. The process is bootstrapped by an existing non-toy lexicon and a natural language generator that converts formal, ontologically-grounded representations of meaning into natural language sentences. The learning method involves a sequence of LLM requests and includes an automatic quality control step. To date, this learning method has been applied to learning multiword expressions whose meanings are equivalent to those of transitive verbs in the agent's lexicon. The experiment demonstrates the benefits of a hybrid learning architecture that integrates knowledge-based methods and resources with both traditional data analytics and LLMs. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3737
-
Otal 2024
A New Perspective on ADHD Research: Knowledge Graph Construction with LLMs and Network Based Insights
arXiv 2024;(): 2024 Ref ID: 8608 Attention-Deficit/Hyperactivity Disorder (ADHD) is a challenging disorder to study due to its complex symptomatology and diverse contributing factors. To explore how we can gain deeper insights on this topic, we performed a network analysis on a comprehensive knowledge graph (KG) of ADHD, constructed by integrating scientific literature and clinical data with the help of cutting-edge large language models. The analysis, including k-core techniques, identified critical nodes and relationships that are central to understanding the disorder. Building on these findings, we curated a knowledge graph that is usable in a context-aware chatbot (Graph-RAG) with Large Language Models (LLMs), enabling accurate and informed interactions. Our knowledge graph not only advances the understanding of ADHD but also provides a powerful tool for research and clinical applications. |
Kwesi
voted
mohammed afaan
voted
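The k-core technique cited in the Otal 2024 abstract has a compact definition worth making concrete: the k-core is the maximal subgraph in which every node has degree at least k, obtained by repeatedly pruning low-degree nodes. A minimal pure-Python sketch follows; the toy graph and the choice of k are illustrative, not taken from the paper.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.
    `adj` maps each node to its set of neighbours."""
    nodes = set(adj)
    changed = True
    while changed:
        changed = False
        for n in list(nodes):
            # Count only neighbours that are still in the surviving subgraph.
            if len(adj[n] & nodes) < k:
                nodes.discard(n)
                changed = True
    return nodes

adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d"},
}
print(sorted(k_core(adj, 3)))  # → ['a', 'b', 'c', 'd']; peripheral node 'e' is pruned
```

Pruning one node can drop a neighbour below the threshold, which is why the loop repeats until no further removals occur.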
#3454
-
Pal 2024
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations
arXiv 2024;(): 2024 Ref ID: 8091 Large language models have the potential to be valuable in the healthcare industry, but it's crucial to verify their safety and effectiveness through rigorous evaluation. For this purpose, we comprehensively evaluated both open-source LLMs and Google's new multimodal LLM called Gemini across Medical reasoning, hallucination detection, and Medical Visual Question Answering tasks. While Gemini showed competence, it lagged behind state-of-the-art models like MedPaLM 2 and GPT-4 in diagnostic accuracy. Additionally, Gemini achieved an accuracy of 61.45% on the medical VQA dataset, significantly lower than GPT-4V's score of 88%. Our analysis revealed that Gemini is highly susceptible to hallucinations, overconfidence, and knowledge gaps, which indicate risks if deployed uncritically. We also performed a detailed analysis by medical subject and test type, providing actionable feedback for developers and clinicians. To mitigate risks, we applied prompting strategies that improved performance. Additionally, we facilitated future research and development by releasing a Python module for medical LLM evaluation and establishing a dedicated leaderboard on Hugging Face for medical domain LLMs. Python module can be found at https://github.com/promptslab/RosettaEval |
yuexi
voted
Davis
voted
#1653
-
Palma 2024
Modelling Interestingness: A Workflow for Surprisal-based Knowledge Mining in Narrative Semantic Networks
CEUR Workshop Proceedings 2024;3749(): CEUR-WS 2024 Ref ID: 4325 This working paper outlines ongoing and planned efforts aimed at achieving an objective modelling of interestingness in cross-domain knowledge bases. In pursuit of this objective, clickstream data serves as a primary component for developing a novel measure of entity-related popularity. This measure is then integrated with two couple-related similarity measures, culminating in the formulation of a new interestingness law. This principled formalization is designed to undergo human validation, ultimately enhancing its reliability and comprehensiveness. The present contribution is intended to be propaedeutic to the development of a pipeline having a Knowledge Graph as input, and an expanded version of the same as output, whereby every link is labelled by an interestingness score, thus highlighting the most interesting paths, determined according to the proposed domain-specific heuristics for interestingness detection. This work is expected to yield significant benefits for Automatic Story Generation. Although this discipline, aided by Machine Learning, has made remarkable progress in surface-level text realization, it still grapples with producing qualitatively rich outputs that offer substantive informativeness. To address this challenge, a Knowledge Graph (particularly its most compelling paths identified through the proposed methodology) is anticipated to integrate the Large Language Model, thus harnessing the final output with the contextual information selected by users throughout the entire workflow, a scenario which is particularly valuable in educational settings, where generated stories frequently serve pedagogical purposes. © 2024 Copyright for this paper by its authors. |
mohammed afaan
voted
Ishan
voted
#3939
-
Pan 2024
Towards Unified Multimodal Editing with Enhanced Knowledge Collaboration
arXiv 2024;(): 2024 Ref ID: 8639 The swift advancement in Multimodal LLMs (MLLMs) also presents significant challenges for effective knowledge editing. Current methods, including intrinsic knowledge editing and external knowledge resorting, each possess strengths and weaknesses, struggling to balance the desired properties of reliability, generality, and locality when applied to MLLMs. In this paper, we propose UniKE, a novel multimodal editing method that establishes a unified perspective and paradigm for intrinsic knowledge editing and external knowledge resorting. Both types of knowledge are conceptualized as vectorized key-value memories, with the corresponding editing processes resembling the assimilation and accommodation phases of human cognition, conducted at the same semantic levels. Within such a unified framework, we further promote knowledge collaboration by disentangling the knowledge representations into the semantic and truthfulness spaces. Extensive experiments validate the effectiveness of our method, which ensures that the post-edit MLLM simultaneously maintains excellent reliability, generality, and locality. The code for UniKE will be available at \url{https://github.com/beepkh/UniKE}. |
Mike
voted
mohammed afaan
voted
#1189
-
Pan 2023
Differentiable Rule Extraction with Large Language Model for Knowledge Graph Reasoning
J. Frontier. Comput. Sci. Technol. 2023;17(10):2403-2412 2023 DOI: 10.3778/j.issn.1673-9418.2306049 · Ref ID: 4808 Knowledge graph (KG) reasoning is to predict missing entities or relationships in incomplete triples, complete structured knowledge, and apply to different downstream tasks. Different from black-box methods which are widely studied, such as methods based on representation learning, the method based on rule extraction achieves an interpretable reasoning paradigm by generalizing first-order logic rules from the KG. To address the gap between discrete symbolic space and continuous embedding space, a differentiable rule extracting method based on the large pre-trained language model (DRaM) is proposed, which integrates discrete first-order logical rules with continuous vector space. In view of the influence of atom sequences in first-order logic rules for the reasoning process, a large pre-trained language model is introduced to encode the reasoning process. The differentiable method DRaM, which integrates first-order logical rules, achieves good results in link prediction tasks on three knowledge graph datasets, Family, Kinship and UMLS, especially for the indicator Hits@10. Comprehensive experimental results show that DRaM can effectively solve the problems of differentiable reasoning on the KGs, and can extract first-order logic rules with confidences from the reasoning process. DRaM not only enhances the reasoning performance with the help of first-order logic rules, but also enhances the interpretability of the method. © 2023 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
Srividya
voted
Ishan
voted
#1378
-
Panda 2024
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():13263-13282 Association for Computational Linguistics (ACL) 2024 Ref ID: 4305 Given unstructured text, Large Language Models (LLMs) are adept at answering simple (single-hop) questions. However, as the complexity of the questions increases, the performance of LLMs degrades. We believe this is due to the overhead associated with understanding the complex question followed by filtering and aggregating unstructured information in the raw text. Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text, aiming to provide a structured overview that simplifies information processing. However, this simplistic approach is query-agnostic and the extracted facts are ambiguous as they lack context. To address these drawbacks and to enable LLMs to answer complex (multi-hop) questions with ease, we propose to use a knowledge graph (KG) that is context-aware and is distilled to contain query-relevant information. The use of our compressed distilled KG as input to the LLM results in our method utilizing up to 67% fewer tokens to represent the query relevant information present in the supporting documents, compared to the state-of-the-art (SoTA) method. Our experiments show consistent improvements over the SoTA across several metrics (EM, F1, BERTScore, and Human Eval) on two popular benchmark datasets (HotpotQA and MuSiQue). © 2024 Association for Computational Linguistics. |
Srividya
voted
yuexi
voted
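The core move in the Panda 2024 abstract is pruning the KG down to query-relevant triples before prompting the LLM, which is where the token savings come from. The sketch below uses a crude word-overlap heuristic as a stand-in for the paper's context-aware distillation; the triples and the scoring rule are illustrative assumptions, not the HOLMES method.

```python
import re

def distill(triples, query, keep=2):
    """Keep the `keep` (head, relation, tail) triples whose tokens overlap
    most with the query; a crude stand-in for query-aware KG distillation."""
    def toks(s):
        return set(re.findall(r"[a-z0-9]+", s.lower().replace("_", " ")))
    q = toks(query)
    return sorted(triples, key=lambda t: len(toks(" ".join(t)) & q),
                  reverse=True)[:keep]

triples = [
    ("Paris", "capital_of", "France"),
    ("France", "borders", "Spain"),
    ("Paris", "hosted", "1900 Olympics"),
]
# The borders triple, irrelevant to the question, is dropped.
print(distill(triples, "What country is Paris the capital of?"))
```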
#2043
-
Papaluca 2024
Zero- and Few-Shots Knowledge Graph Triplet Extraction with Large Language Models
KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():12-23 Association for Computational Linguistics (ACL) 2024 Ref ID: 4366 In this work, we tested the Triplet Extraction (TE) capabilities of a variety of Large Language Models (LLMs) of different sizes in the Zero- and Few-Shots settings. In detail, we proposed a pipeline that dynamically gathers contextual information from a Knowledge Base (KB), both in the form of context triplets and of (sentence, triplets) pairs as examples, and provides it to the LLM through a prompt. The additional context allowed the LLMs to be competitive with all the older fully trained baselines based on the Bidirectional Long Short-Term Memory (BiLSTM) Network architecture. We further conducted a detailed analysis of the quality of the gathered KB context, finding it to be strongly correlated with the final TE performance of the model. In contrast, the size of the model appeared to only logarithmically improve the TE capabilities of the LLMs. We release the code on GitHub for reproducibility. ©2024 Association for Computational Linguistics. |
Mike
voted
Srividya
voted
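The pipeline in the Papaluca 2024 abstract packs KB context triplets and (sentence, triplets) example pairs into a single prompt for the LLM. A minimal sketch of such prompt assembly follows; the template wording is an assumption, not the paper's exact prompt.

```python
def build_prompt(context_triples, examples, sentence):
    """Assemble a few-shot triplet-extraction prompt from KB context
    triplets and (sentence, triplets) example pairs."""
    def fmt(ts):
        return "; ".join(f"({h}, {r}, {t})" for h, r, t in ts)
    lines = ["Extract (head, relation, tail) triplets from the sentence."]
    lines.append("Known facts: " + fmt(context_triples))
    for ex_sentence, ex_triples in examples:   # few-shot demonstrations
        lines.append(f"Sentence: {ex_sentence}")
        lines.append("Triplets: " + fmt(ex_triples))
    lines.append(f"Sentence: {sentence}")      # the actual query
    lines.append("Triplets:")                  # the LLM completes from here
    return "\n".join(lines)

print(build_prompt(
    [("Rome", "located_in", "Italy")],
    [("Rome is in Italy.", [("Rome", "located_in", "Italy")])],
    "Berlin is in Germany."))
```

In the zero-shot setting, `examples` is simply left empty and only the KB context triplets remain.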
#2896
-
Paraiso-Medina 2015
Semantic Normalization and Query Abstraction Based on SNOMED-CT and HL7: Supporting Multicentric Clinical Trials
IEEE Journal of Biomedical and Health Informatics 2015;19(3):1061-1067 2015 DOI: 10.1109/JBHI.2014.2357025 · Ref ID: 6480 Advances in the use of omic data and other biomarkers are increasing the number of variables in clinical research. Additional data have stratified the population of patients and require that current studies be performed among multiple institutions. Semantic interoperability and standardized data representation are a crucial task in the management of modern clinical trials. In the past few years, different efforts have focused on integrating biomedical information. Due to the complexity of this domain and the specific requirements of clinical research, the majority of data integration tasks are still performed manually. This paper presents a semantic normalization process and a query abstraction mechanism to facilitate data integration and retrieval. A process based on well-established standards from the biomedical domain and the latest semantic web technologies has been developed. Methods proposed in this paper have been tested within the EURECA EU research project, where clinical scenarios require the extraction of semantic knowledge from biomedical vocabularies. The aim of this paper is to provide a novel method to abstract from the data model and query syntax. The proposed approach has been compared with other initiatives in the field by storing the same dataset with each of those solutions. Results show an extended functionality and query capabilities at the cost of slightly worse performance in query execution. Implementations in real settings have shown that following this approach, usable interfaces can be developed to exploit clinical trial data outcomes. |
mohammed afaan
voted
yuexi
voted
#2119
-
Paraschiv 2015
Analyzing the Semantic Relatedness of Paper Abstracts: An Application to the Educational Research Field
2015 20th International Conference on Control Systems and Computer Science 2015;():759-764 2015 DOI: 10.1109/CSCS.2015.146 · Ref ID: 6145 Each domain, along with its knowledge base, changes over time and every timeframe is centered on specific topics that emerge from different ongoing research projects. As searching for relevant resources is a time-consuming process, the automatic extraction of the most important and relevant articles from a domain becomes essential in supporting researchers in their day-to-day activities. The proposed analysis extends previous research focused on extracting co-citations between papers, with the purpose of comparing their overall importance within the domain from a semantic perspective. Our method focuses on the semantic analysis of paper abstracts by using Natural Language Processing (NLP) techniques such as Latent Semantic Analysis, Latent Dirichlet Allocation or specific ontology distances, i.e., WordNet. Moreover, the defined mechanisms are enforced on two different sub domains from the corpora generated around the keywords "e-learning" and "computer". Graph visual representations are used to highlight the keywords of each sub domain, links among concepts and between articles, as well as specific document similarity views, or scores reflecting the keyword-abstract overlaps. In the end, conclusions and future improvements are presented, emphasizing nevertheless the key elements of our research support framework. |
Mike
voted
Kwesi
voted
#744
-
Park 2023
Selective UMLS knowledge infusion for biomedical question answering
One of the artificial intelligence applications in the biomedical field is knowledge-intensive question-answering. As domain expertise is particularly crucial in this field, we propose a method for efficiently infusing biomedical knowledge into pretrained language models, ultimately targeting biomedical question-answering. Transferring all semantics of a large knowledge graph into the entire model requires too many parameters, increasing computational cost and time. We investigate an efficient approach that leverages adapters to inject Unified Medical Language System knowledge into pretrained language models, and we question the need to use all semantics in the knowledge graph. This study focuses on strategies of partitioning the knowledge graph and either discarding or merging some groups for more efficient pretraining. According to the results of three biomedical question answering finetuning datasets, the adapters pretrained on semantically partitioned groups showed more efficient performance in terms of evaluation metrics, required parameters, and time. The results also show that discarding groups with fewer concepts is a better direction for small datasets, and merging these groups is better for large datasets. Furthermore, the metric results show a slight improvement, demonstrating that the adapter methodology is rather insensitive to the group formulation. |
Xinchen
voted
Srividya
voted
#3465
-
Park 2024
Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation
arXiv 2024;(): 2024 Ref ID: 8696 Knowledge graph-grounded dialog generation requires retrieving a dialog-relevant subgraph from the given knowledge base graph and integrating it with the dialog history. Previous works typically represent the graph using an external encoder, such as graph neural networks, and retrieve relevant triplets based on the similarity between single-vector representations of triplets and the dialog history. However, these external encoders fail to leverage the rich knowledge of pretrained language models, and the retrieval process is also suboptimal due to the information bottleneck caused by the single-vector abstraction of the dialog history. In this work, we propose Dialog generation with Generative Subgraph Retrieval (DialogGSR), which retrieves relevant knowledge subgraphs by directly generating their token sequences on top of language models. For effective generative subgraph retrieval, we introduce two key methods: (i) structure-aware knowledge graph linearization with self-supervised graph-specific tokens and (ii) graph-constrained decoding utilizing graph structural proximity-based entity informativeness scores for valid and relevant generative retrieval. DialogGSR achieves state-of-the-art performance in knowledge graph-grounded dialog generation, as demonstrated on OpenDialKG and KOMODIS datasets. |
Xinchen
voted
Mike
voted
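Graph-constrained decoding, as described for DialogGSR above, restricts each generation step to tokens that keep the output a valid structure in the KG. The greedy toy below captures only that constraint: the graph, the scores, and the greedy selection are illustrative stand-ins for the paper's LM logits and informativeness-scored decoding.

```python
def constrained_decode(adj, scores, start, steps):
    """Greedy graph-constrained generation: from `start`, repeatedly emit
    the highest-scoring neighbour, so the result is always a valid KG path."""
    path = [start]
    for _ in range(steps):
        candidates = adj.get(path[-1], set())
        if not candidates:
            break  # dead end: nothing valid left to emit
        path.append(max(candidates, key=lambda n: scores.get(n, 0.0)))
    return path

adj = {"movie": {"director", "actor"}, "director": {"award"}}
scores = {"director": 0.9, "actor": 0.4, "award": 0.7}
print(constrained_decode(adj, scores, "movie", 2))  # → ['movie', 'director', 'award']
```

The design point is that invalid continuations are masked out before scoring, rather than filtered after generation.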
#120
-
Parolin 2021
CoMe-KE: A New Transformers Based Approach for Knowledge Extraction in Conflict and Mediation Domain
9th IEEE International Conference on Big Data (IEEE BigData) 2021;():1449-1459 Electr Network Ieee 2021 DOI: 10.1109/BigData52589.2021.9672080 · Ref ID: 3776 Knowledge discovery and extraction approaches attract special attention across industries and areas moving toward the 5V Era. In the political and social sciences, scholars and governments dedicate considerable resources to develop intelligent systems for monitoring, analyzing and predicting conflicts and affairs involving political entities across the globe. Such systems rely on background knowledge from external knowledge bases, that conflict experts commonly maintain manually. The high costs and extensive human efforts associated with updating and extending these repositories often compromise their correctness. Here we introduce CoMe-KE (Conflict and Mediation Knowledge Extractor) to automatically extend knowledge bases about conflict and mediation events. We explore state-of-the-art natural language models to discover new political entities, their roles and status from news. We propose a distantly supervised method and an innovative zero-shot approach based on a dynamic hypothesis procedure. Our methods leverage pre-trained models through transfer learning techniques to obtain excellent results with no need for labeled data. Finally, we demonstrate the superiority of our method through a comprehensive set of experiments involving two study cases in the social sciences domain. CoMe-KE significantly outperforms the existing baseline, with, on average, double the performance in retrieving new political entities. |
mohammed afaan
voted
yuexi
voted
#3140
-
Patel 2020
Jointly Learning Knowledge Graph Embeddings, Fine Grain Entity Types and Language Models
2020;(): University of Maryland, Baltimore County 2020 Ref ID: 7196 |
Srividya
voted
Ishan
voted
#2215
-
Paula 2015
Building Up Conceptual Spaces: An ESOM Supported Strategy
2015 Brazilian Conference on Intelligent Systems (BRACIS) 2015;():122-127 2015 DOI: 10.1109/BRACIS.2015.63 · Ref ID: 6728 Intelligent agents need robust knowledge representation schemes to model and solve complex real-world problems. A historical approach is the symbolic representation proposed in classic AI. Although symbolic representations have their appeal, the use of abstract symbols, representing general knowledge about the world, brings limitations to the way agents develop certain cognitive functions, as in the case of language. In the standard symbolic approach, there is no ground for the symbols used internally by the agents, creating a situation known as the symbol grounding problem, as explained by Harnad (1990). To deal with this problem, Gardenfors (2004) introduced a semantic theory named conceptual spaces, which attribute meaning to linguistic symbols. The geometry of such spaces forms a robust structure to conceptualize information. In this paper, we use an unsupervised classifier named Evolving Self-Organizing Maps (ESOM) to act as the computational implementation of conceptual spaces. Our results confirmed ESOM's capability to create concepts, aiding agents in reaching a linguistic consensus about different words exchanged during an object-naming game. Besides providing a way for symbols to get meaning in a biologically realistic way, these results also open possibilities for other characteristics of conceptual spaces to be applied to the study of artificial language, e.g., grammatical language. |
mohammed afaan
voted
yuexi
voted
#382
-
Payumo 2024
Intelligent Knowledge Base Search Tool using Large Language Model and Graph Neural Network
Conference on Pattern Recognition and Prediction XXXV 2024;13040(): National Harbor, MD Spie-Int Soc Optical Engineering 2024 DOI: 10.1117/12.3014075 · Ref ID: 3359 Within many organizations, a vast number of communications, memos, reports and documents have been accumulated in internal servers. Efficiently discovering relevant entries can reduce time spent addressing organizational needs such as personnel skills matching or anomaly resolution. However, per organization, information retrieval on these disparate data types can be challenging, as systems must be designed for their domain while accounting for unstructured and inconsistent datasets. Traditional querying via search terms often requires relevancy tuning by subject matter experts which makes it difficult to build retrieval systems. We argue that development of retrieval systems can be simplified and enhanced by embedding data with Large Language Models (LLMs), organizing information in a Knowledge Graph (KG) structure, and further encoding their relational features through a Graph Neural Network (GNN). One of the major challenges of using GNNs for information retrieval is optimizing negative edge selection. Training GNNs requires a balanced ratio between positive and negative edges; however, the space of negative edges is exponentially larger than that of positive edges. In this work, we extend the LLM-GNN hybrid architecture by applying ensemble voting on a set of trained LLM-GNNs. Preliminary results have shown modest improvement on our personnel-document matching tasks. This work contributes to a developmental effort that aims to help engineers and scientists find new research opportunities, learn from past mistakes, and quickly address future needs. |
Xinchen
voted
mohammed afaan
voted
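The ensemble step in the Payumo 2024 abstract applies voting over a set of trained LLM-GNNs. A minimal majority-vote sketch over per-model candidate lists follows; the models themselves are stubbed out, and the specific voting rule is an assumption rather than the paper's.

```python
from collections import Counter

def ensemble_vote(predictions, threshold=None):
    """`predictions` holds one candidate list per model; keep candidates
    proposed by at least `threshold` models (default: a strict majority)."""
    if threshold is None:
        threshold = len(predictions) // 2 + 1
    # set(preds) so a model cannot vote twice for the same candidate.
    counts = Counter(c for preds in predictions for c in set(preds))
    return {c for c, n in counts.items() if n >= threshold}

# Three hypothetical LLM-GNN rankers nominate matching documents.
votes = [["doc1", "doc2"], ["doc1", "doc3"], ["doc1", "doc2"]]
print(sorted(ensemble_vote(votes)))  # → ['doc1', 'doc2'] (doc3 lacks majority support)
```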
#1433
-
Paz-Argaman 2024
Into the Unknown: Generating Geospatial Descriptions for New Environments
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():2259-2273 Association for Computational Linguistics (ACL) 2024 Ref ID: 4214 Similar to vision-and-language navigation (VLN) tasks that focus on bridging the gap between vision and language for embodied navigation, the new Rendezvous (RVS) task requires reasoning over allocentric spatial relationships (independent of the observer's viewpoint) using non-sequential navigation instructions and maps. However, performance substantially drops in new environments with no training data. Using opensource descriptions paired with coordinates (e.g., Wikipedia) provides training data but suffers from limited spatially-oriented text resulting in low geolocation resolution. We propose a large-scale augmentation method for generating high-quality synthetic data for new environments using readily available geospatial data. Our method constructs a grounded knowledge-graph, capturing entity relationships. Sampled entities and relations (“shop north of school”) generate navigation instructions via (i) generating numerous templates using context-free grammar (CFG) to embed specific entities and relations; (ii) feeding the entities and relation into a large language model (LLM) for instruction generation. A comprehensive evaluation on RVS, showed that our approach improves the 100-meter accuracy by 45.83% on unseen environments. Furthermore, we demonstrate that models trained with CFG-based augmentation achieve superior performance compared with those trained with LLM-based augmentation, both in unseen and seen environments. These findings suggest that the potential advantages of explicitly structuring spatial information for text-based geospatial reasoning in previously unknown, can unlock data-scarce scenarios. © 2024 Association for Computational Linguistics. |
Ishan
voted
Srividya
voted
#1811
-
Pei 2024
Research on Public Security Professional Small Sample Knowledge Extraction Method Based on Large Language Model
J. Frontier. Comput. Sci. Technol. 2024;18(10):2630-2642 2024 DOI: 10.3778/j.issn.1673-9418.2403039 · Ref ID: 3897 The rapid development of informatization and digitalization in public security business has generated a large amount of law enforcement case data in public security work. However, due to various types of text and large amount of information, front-line police officers often face problems such as low reading efficiency and difficulty in aggregating information in the process of reading case files. In order to further utilize the law enforcement case text, it is necessary to conduct intelligent analysis and knowledge extraction. However, due to the professionalism, data sensitivity, confidentiality of public security professional law enforcement case text, as well as the requirements of public security data going out of the network, only a small number of learning training samples can be obtained, and the traditional deep learning model has unsatisfactory extraction effect. Therefore, this paper proposes to build a large language model in vertical fields with fewer resources and data, and realize the adaptation of the model to the public security profession. The model uses knowledge editing technology MEMIT (mass-editing memory in a transformer), low-resource fine-tuning technology LoRA (low-rank adaptation), and prompt templates to improve the model's understanding of public security knowledge such as police terminology and common sense. Moreover, in order to further improve the knowledge extraction effect of the model, a small sample law enforcement case text data extraction process is designed to better integrate the professional knowledge related to the case in the model. Experimental results show that the accuracy of the public security professional vertical field large language model integrated with the extraction process in various knowledge extraction tasks is significantly improved compared with the traditional methods, which helps front-line police officers quickly, objectively and accurately analyze law enforcement case text, dig out potential case information, and support the intelligent development of public security work. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
brandon
voted
Kwesi
voted
#1143
-
Peng 2022
COPEN: Probing Conceptual Knowledge in Pre-trained Language Models
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5015-5035 Association for Computational Linguistics (ACL) 2022 Ref ID: 5410 Conceptual knowledge is fundamental to human cognition and knowledge bases. However, existing knowledge probing works only focus on evaluating factual knowledge of pre-trained language models (PLMs) and ignore conceptual knowledge. Since conceptual knowledge often appears as implicit commonsense behind texts, designing probes for conceptual knowledge is hard. Inspired by knowledge representation schemata, we comprehensively evaluate conceptual knowledge of PLMs by designing three tasks to probe whether PLMs organize entities by conceptual similarities, learn conceptual properties, and conceptualize entities in contexts, respectively. For the tasks, we collect and annotate 24k data instances covering 393 concepts, which is COPEN, a COnceptual knowledge Probing bENchmark. Extensive experiments on different sizes and types of PLMs show that existing PLMs systematically lack conceptual knowledge and suffer from various spurious correlations. We believe this is a critical bottleneck for realizing human-like cognition in PLMs. COPEN and our codes are publicly released at https://github.com/THU-KEG/COPEN. © 2022 Association for Computational Linguistics. |
Mike
voted
Xinchen
voted
#1171
-
Peng 2024
Deja vu: Contrastive Historical Modeling with Prefix-tuning for Temporal Knowledge Graph Reasoning
Findings of the Association for Computational Linguistics: NAACL 2024 - Findings 2024;():1178-1191 Association for Computational Linguistics (ACL) 2024 Ref ID: 4603 Temporal Knowledge Graph Reasoning (TKGR) is the task of inferring missing facts for incomplete TKGs in complex scenarios (e.g., transductive and inductive settings), which has been gaining increasing attention. Recently, to mitigate dependence on structured connections in TKGs, text-based methods have been developed to utilize rich linguistic information from entity descriptions. However, suffering from the enormous parameters and inflexibility of pre-trained language models, existing text-based methods struggle to balance the textual knowledge and temporal information with computationally expensive purpose-built training strategies. To tap the potential of text-based models for TKGR in various complex scenarios, we propose ChapTER, a Contrastive historical modeling framework with prefix-tuning for TEmporal Reasoning. ChapTER feeds history-contextualized text into the pseudo-Siamese encoders to strike a textual-temporal balance via contrastive estimation between queries and candidates. By introducing virtual time prefix tokens, it applies a prefix-based tuning method to make the frozen PLM capable of TKGR tasks under different settings. We evaluate ChapTER on four transductive and three few-shot inductive TKGR benchmarks, and experimental results demonstrate that ChapTER achieves superior performance compared to competitive baselines with only 0.17% tuned parameters. We conduct thorough analysis to verify the effectiveness, flexibility and efficiency of ChapTER. © 2024 Association for Computational Linguistics. |
Srividya
voted
Ishan
voted
#1705
-
Peng 2023
Ontology Matching using Textual Class Descriptions
CEUR Workshop Proceedings 2023;3591():67-72 CEUR-WS 2023 Ref ID: 5079 In this paper, we propose TEXTO, a TEXT-based Ontology matching system. This matcher leverages the rich semantic information of classes available in most ontologies by a combination of a pre-trained word embedding model and a pre-trained language model. Its performance is evaluated on the datasets of the OAEI Common Knowledge Graphs Track, augmented with the description of each class, and a new dataset based on the refreshed alignment of Schema.org and Wikidata. Our results demonstrate that TEXTO outperforms all state-of-the-art matchers in terms of precision, recall and F1 score. In particular, we show that almost perfect class alignment can be achieved using textual content only, excluding any structural information like the graph of classes or the instances of each class. © 2023 Copyright for this paper by its authors. |
mohammed afaan
voted
Ishan
voted
#3642
-
Peng 2024
Learning Rules from KGs Guided by Language Models
arXiv 2024;(): 2024 Ref ID: 8594 Advances in information extraction have enabled the automatic construction of large knowledge graphs (e.g., Yago, Wikidata or Google KG), which are widely used in many applications like semantic search or data analytics. However, due to their semi-automatic construction, KGs are often incomplete. Rule learning methods, concerned with the extraction of frequent patterns from KGs and casting them into rules, can be applied to predict potentially missing facts. A crucial step in this process is rule ranking. Ranking of rules is especially challenging over highly incomplete or biased KGs (e.g., KGs predominantly storing facts about famous people), as in this case biased rules might fit the data best and be ranked at the top based on standard statistical metrics like rule confidence. To address this issue, prior works proposed to rank rules not only relying on the original KG but also facts predicted by a KG embedding model. At the same time, with the recent rise of Language Models (LMs), several works have claimed that LMs can be used as alternative means for KG completion. In this work, our goal is to verify to which extent the exploitation of LMs is helpful for improving the quality of rule learning systems. |
Voted: Kwesi, Srividya
#493
-
Perevalov 2024
Language Models as SPARQL Query Filtering for Improving the Quality of Multilingual Question Answering over Knowledge Graphs
24th International Conference Web Engineering (ICWE) 2024;14629():3-18 Tampere, FINLAND Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-62362-2_1 · Ref ID: 3152 Question Answering systems working over Knowledge Graphs (KGQA) generate a ranked list of SPARQL query candidates for a given natural-language question. In this paper, we follow our long-term research agenda of providing trustworthy KGQA systems - here - by presenting a query filtering approach that utilizes (large) language models (LMs/LLMs), s.t., correct and incorrect queries can be distinguished. In contrast to the previous work, we address here multilingual questions represented in major languages (English, German, French, Spanish, and Russian), and confirm the generalizability of our approach by also evaluating it on low-resource languages (Ukrainian, Armenian, Lithuanian, Belarusian, and Bashkir). For our experiments, we used the following LMs: BERT, DistilBERT, Mistral, Zephyr, GPT-3.5, and GPT-4. The LMs were applied to the KGQA systems - QAnswer and MemQA - as SPARQL query filters. The approach was evaluated on the multilingual Wikidata-based dataset QALD-9-plus. The experimental results suggest that the KGQA systems achieve quality improvements for all languages when using our query-filtering approach. |
Voted: Kwesi, Xinchen
#1117
-
Peroni 2014
Conclusions
Law. Gov. Technol. Ser. 2014;15():257-262 Springer Science and Business Media B.V. 2014 DOI: 10.1007/978-3-319-04777-5_7 · Ref ID: 5805 In this chapter, I conclude the discussion of my work on Semantic Publishing. In particular, I summarise my own personal contributions in order to address one of the main issues of this field, i.e., the linking of a text to the formal representation of its meaning and thus the representation of its structure and of its argumentative discourse. In addition, I summarise my own contribution on the development of interfaces to hide the complexity of markup and ontology formalisms behind user-friendly views in order to help users of Semantic Publishing (e.g., scholars, publishers, archivists, librarians, etc.) that may have difficulties in interacting with Semantic Publishing technologies. Finally, I conclude the chapter introducing planned future works for all the languages, models and tools presented. © 2014, Springer International Publishing Switzerland. |
Voted: mohammed afaan, yuexi
#988
-
Pertsas 2024
An Annotated Dataset for Transformer-based Scholarly Information Extraction and Linguistic Linked Data Generation
9th Workshop on Linked Data in Linguistics: Resources, Applications, Best Practices, LDL 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;():84-93 European Language Resources Association (ELRA) 2024 Ref ID: 4635 We present a manually curated and annotated, multidisciplinary dataset of 15,262 sentences from research articles (abstract and main text) that can be used for transformer-based extraction from scholarly publications of three types of entities: 1) research methods, named entities of variable length, 2) research goals, entities that appear as textual spans of variable length with mostly fixed lexico-syntactic-structure, and 3) research activities, entities that appear as textual spans of variable length with complex lexico-syntactic structure. We explore the capabilities of our dataset by using it for training/fine-tuning various ML and transformer-based models. We compare our finetuned models as well as LLM responses (chat-GPT 3.5) based on 10-shot learning, by measuring F1 scores in token-based, entity-based strict and entity-based partial evaluations across interdisciplinary and discipline-specific datasets in order to capture any possible differences in discipline-oriented writing styles. Results show that fine tuning of transformer-based models significantly outperforms the performance of few-shot learning of LLMs such as chat-GPT, highlighting the significance of annotation datasets in such tasks. Our dataset can also be used as a source for linguistic linked data by itself. We demonstrate this by presenting indicative queries in SPARQL, executed over such an RDF knowledge graph. © 2024 ELRA Language Resource Association. |
Voted: Srividya, Mike
#1676
-
Phatak 2024
Narrating Causal Graphs with Large Language Models
Proceedings of the Annual Hawaii International Conference on System Sciences 2024;():7530-7539 IEEE Computer Society 2024 Ref ID: 4795 The use of generative AI to create text descriptions from graphs has mostly focused on knowledge graphs, which connect concepts using facts. In this work we explore the capability of large pretrained language models to generate text from causal graphs, where salient concepts are represented as nodes and causality is represented via directed, typed edges. The causal reasoning encoded in these graphs can support applications as diverse as healthcare or marketing. Using two publicly available causal graph datasets, we empirically investigate the performance of four GPT-3 models under various settings. Our results indicate that while causal text descriptions improve with training data, compared to fact-based graphs, they are harder to generate under zero-shot settings. Results further suggest that users of generative AI can deploy future applications faster since similar performances are obtained when training a model with only a few examples as compared to fine-tuning via a large curated dataset. © 2024 IEEE Computer Society. All rights reserved. |
Voted: Ishan, Srividya
#3701
-
Piantadosi 2022
Meaning without reference in large language models
arXiv 2022;(): 2022 Ref ID: 7570 The widespread success of large language models (LLMs) has been met with skepticism that they possess anything like human concepts or meanings. Contrary to claims that LLMs possess no meaning whatsoever, we argue that they likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from conceptual role. Because conceptual role is defined by the relationships between internal representational states, meaning cannot be determined from a model's architecture, training data, or objective function, but only by examination of how its internal states relate to each other. This approach may clarify why and how LLMs are so successful and suggest how they can be made more human-like. |
Voted: Kwesi, mohammed afaan
#883
-
Piat 2023
What does KnowBert-UMLS forget?
20th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA) 2023;(): Giza, EGYPT Ieee 2023 DOI: 10.1109/aiccsa59173.2023.10479333 · Ref ID: 3060 Integrating a source of structured prior knowledge, such as a knowledge graph, into transformer-based language models is an increasingly popular method for increasing data efficiency and adapting them to a target domain. However, most methods for integrating structured knowledge into language models require additional training in order to adapt the model to the non-textual modality. This process typically leads to some amount of catastrophic forgetting on the general domain. KnowBert is one such knowledge integration method which can incorporate information from a variety of knowledge graphs to enhance the capabilities of transformer-based language models such as BERT. We conduct a qualitative analysis of the results of KnowBert-UMLS, a biomedically specialized KnowBert model, on a variety of linguistic tasks. Our results reveal that its increased understanding of biomedical concepts comes at the cost, specifically, of general common-sense knowledge and understanding of casual speech. |
Voted: Mike, Srividya
#1358
-
Plenz 2024
Graph Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4477-4494 Association for Computational Linguistics (ACL) 2024 Ref ID: 4399 While Language Models (LMs) are the workhorses of NLP, their interplay with structured knowledge graphs (KGs) is still actively researched. Current methods for encoding such graphs typically either (i) linearize them for embedding with LMs - which underutilize structural information, or (ii) use Graph Neural Networks (GNNs) to preserve the graph structure - but GNNs cannot represent text features as well as pretrained LMs. In our work we introduce a novel LM type, the Graph Language Model (GLM), that integrates the strengths of both approaches and mitigates their weaknesses. The GLM parameters are initialized from a pretrained LM to enhance understanding of individual graph concepts and triplets. Simultaneously, we design the GLM's architecture to incorporate graph biases, thereby promoting effective knowledge distribution within the graph. This enables GLMs to process graphs, texts, and interleaved inputs of both. Empirical evaluations on relation classification tasks show that GLM embeddings surpass both LM- and GNN-based baselines in supervised and zero-shot setting, demonstrating their versatility. © 2024 Association for Computational Linguistics. |
Voted: Mike, Xinchen
#2298
-
Pol 2023
A Data-Driven Approach for Modeling Unknown Multi-Scale Systems
2023 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion (ACSOS-C) 2023;():35-40 2023 DOI: 10.1109/ACSOS-C58168.2023.00033 · Ref ID: 6501 Complex adaptive systems often organize via multiple abstraction levels, or ‘scales’, interconnected by feedback loops. This enables adaptation and survival in changing environments, while managing complexity with limited resources. For an external observer unaware of such multi-scale structure, modeling an unknown system may be a complicated endeavor. This position paper proposes a data-driven approach for addressing this issue. It generates multi-scale models from incomplete monitoring data, capitalizing on the behavioral regularities that stem from its feedback loops. It also defines the appropriate language elements for expressing these multi-scale models. We validate our approach on data obtained from a theoretical multi-scale system: a holonic cellular automata (HCA) simulator. Results show that the proposed approach can identify the HCA's three abstraction levels and main modeling concepts. This is an encouraging first step towards establishing automatic methods for multi-scale model discovery from partial observations. |
Voted: mohammed afaan, Ishan
#3244
-
Porada 2019
Can a Gorilla Ride a Camel? Learning Semantic Plausibility from Text
arXiv 2019;(): 2019 Ref ID: 7382 Modeling semantic plausibility requires commonsense knowledge about the world and has been used as a testbed for exploring various knowledge representations. Previous work has focused specifically on modeling physical plausibility and shown that distributional methods fail when tested in a supervised setting. At the same time, distributional models, namely large pretrained language models, have led to improved results for many natural language understanding tasks. In this work, we show that these pretrained language models are in fact effective at modeling physical plausibility in the supervised setting. We therefore present the more difficult problem of learning to model physical plausibility directly from text. We create a training set by extracting attested events from a large corpus, and we provide a baseline for training on these attested events in a self-supervised manner and testing on a physical plausibility task. We believe results could be further improved by injecting explicit commonsense knowledge into a distributional model. |
Voted: Mike, Srividya
#823
-
Postiglione 2021
Towards an Italian Healthcare Knowledge Graph
14th International Conference on Similarity Search and Applications (SISA) 2021;13058():387-394 TU Dortmund, ELECTR NETWORK Springer International Publishing Ag 2021 DOI: 10.1007/978-3-030-89657-7_29 · Ref ID: 3086 Electronic Health Records (EHRs), Big Data, Knowledge Graphs (KGs) and machine learning can potentially be a great step towards the technological shift from one-size-fits-all medicine, where treatments are based on an equal protocol for all patients, to precision medicine, which accounts for all their individual information: lifestyle, preferences, health history, genomics, and so on. However, the lack of data which characterizes low-resource languages is a huge limitation for the application of the above-mentioned technologies. In this work, we will try to fill this gap by means of transformer language models and few-shot approaches, and we will apply similarity-based deep learning techniques on the constructed KG for downstream applications. The proposed architecture is general and thus applicable to any low-resource language. |
Voted: Mike, Davis
#2632
-
Potts 2022
Leveraging Multiple Representations of Topic Models for Knowledge Discovery
Topic models are often useful in categorization of related documents in information retrieval and knowledge discovery systems, especially for large datasets. Interpreting the output of these models remains an ongoing challenge for the research community. The typical practice in the application of topic models is to tune the parameters of a chosen model for a target dataset and select the model with the best output based on a given metric. We present a novel perspective on topic analysis by introducing a process for combining output from multiple models with different theoretical underpinnings. We show that this results in our ability to tackle novel tasks such as semantic characterization of content that cannot be carried out by using single models. One example task is to characterize the differences between topics or documents in terms of their purpose and also importance with respect to the underlying output of the discovery algorithm. To show the potential benefit of leveraging multiple models, we present an algorithm to map the term-space of Latent Dirichlet Allocation (LDA) to the neural document-embedding space of doc2vec. We also show that by utilizing both models in parallel and analyzing the resulting document distributions using the Normalized Pointwise Mutual Information (NPMI) metric, we can gain insight into the purpose and importance of topics across models. This approach moves beyond topic identification to a richer characterization of the information and provides a better understanding of the complex relationships between these typically competing techniques. |
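The NPMI metric named in this abstract has a standard closed form, NPMI(x, y) = PMI(x, y) / (−log p(x, y)); a minimal Python sketch (the textbook formula, not the authors' implementation) computed from raw co-occurrence counts:

```python
import math

def npmi(count_xy, count_x, count_y, n):
    """Normalized Pointwise Mutual Information for a term pair.
    count_xy: co-occurrence count; count_x, count_y: marginal counts;
    n: total number of observation windows. Returns a value in [-1, 1]:
    1 means the terms only ever occur together, 0 means independence."""
    p_xy = count_xy / n
    p_x = count_x / n
    p_y = count_y / n
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)
```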
Voted: mohammed afaan, yuexi
#549
-
Pouramini 2024
Matching tasks to objectives: Fine-tuning and prompt-tuning strategies for encoder-decoder pre-trained language models
Prompt-based learning has emerged as a dominant paradigm in natural language processing. This study explores the impact of diverse pre-training objectives on the performance of encoder-decoder pre-trained language models across generation and question answering tasks, with a focus on commonsense knowledge retrieval and completion. We highlight the benefits of incorporating multiple objectives during both pre-training and fine-tuning stages. We introduce the Match Task to Objective (MTO) framework and methods for determining the appropriate objective for a given task. This framework offers automated methods to prepare task-related data for adaptation through unsupervised training, based on the identified objective. In the fine-tuning stage, we design novel templates that align with the objectives of the pre-training and adaptation stages. When aligned with task requirements, these strategies can achieve a performance gain of over 120% compared to conventional methods in few-shot settings. They significantly outperform related works in few-shot settings and exceed the baseline even in full-dataset scenarios. Furthermore, we extend this approach to include prompt-tuning methodologies, providing guidance for more effective soft prompt engineering and optimization. Our strategies significantly enhance prompt-tuning performance as well. These insights hold substantial value, precisely guiding the selection and optimization of models customized for specific tasks. Code is available at https://github.com/puraminy/MTO/ |
Voted: Ishan, Kwesi
#3295
-
Pradeep 2024
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models
arXiv 2024;(): 2024 Ref ID: 8524 The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present ConvKGYarn, a scalable method for generating up-to-date and configurable conversational KGQA datasets. Qualitative psychometric analyses confirm our method can generate high-quality datasets rivaling a popular conversational KGQA dataset while offering it at scale and covering a wide range of human-interaction configurations. We showcase its utility by testing LLMs on diverse conversations - exploring model behavior on conversational KGQA sets with different configurations grounded in the same KG fact set. Our results highlight the ability of ConvKGYarn to improve KGQA foundations and evaluate parametric knowledge of LLMs, thus offering a robust solution to the constantly evolving landscape of conversational assistants. |
Voted: mohammed afaan, Ishan
#3923
-
Prasad 2024
Towards Development of Automated Knowledge Maps and Databases for Materials Engineering using Large Language Models
arXiv 2024;(): 2024 Ref ID: 8113 In this work, a Large Language Model (LLM)-based workflow is presented that utilizes the OpenAI ChatGPT model GPT-3.5-turbo-1106 and the Google Gemini Pro model to create summaries of text, data and images from research articles. It is demonstrated that by using a series of processing steps, the key information can be arranged in tabular form and knowledge graphs to capture underlying concepts. Our method offers efficiency and comprehension, enabling researchers to extract insights more effectively. Evaluation based on a diverse Scientific Paper Collection demonstrates our approach in facilitating discovery of knowledge. This work contributes to accelerated material design by smart literature review. The method has been tested based on various qualitative and quantitative measures of gathered information. The ChatGPT model achieved an F1 score of 0.40 for an exact match (ROUGE-1, ROUGE-2) but 0.479 for a relaxed match (ROUGE-L, ROUGE-Lsum) on the structured data format in the performance evaluation. Google Gemini Pro outperforms ChatGPT with an F1 score of 0.50 for an exact match and 0.63 for a relaxed match. This method facilitates high-throughput development of a database relevant to materials informatics. For demonstration, an example of data extraction and knowledge graph formation based on a manuscript about a titanium alloy is discussed. |
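The exact vs. relaxed ROUGE distinction in this abstract rests on ROUGE-L, whose score is driven by the longest common subsequence between reference and candidate token sequences. A minimal Python sketch of the standard ROUGE-L F1 formulation (not the paper's evaluation code):

```python
def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 between two token lists: the longest common
    subsequence (LCS) length gives recall against the reference
    length and precision against the candidate length."""
    m, n = len(reference), len(candidate)
    # Dynamic-programming LCS table.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if reference[i] == candidate[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / n, lcs / m
    return 2 * precision * recall / (precision + recall)
```

Unlike ROUGE-1/ROUGE-2, which require contiguous n-gram overlap, the LCS tolerates gaps, which is why it behaves as the "relaxed" match here.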
Voted: Mike, mohammed afaan
#3186
-
Priyanshu 2023
Are Chatbots Ready for Privacy-Sensitive Applications? An Investigation into Input Regurgitation and Prompt-Induced Sanitization
arXiv 2023;(): 2023 Ref ID: 7726 LLM-powered chatbots are becoming widely adopted in applications such as healthcare, personal assistants, industry hiring decisions, etc. In many of these cases, chatbots are fed sensitive, personal information in their prompts, as samples for in-context learning, retrieved records from a database, or as part of the conversation. The information provided in the prompt could directly appear in the output, which might have privacy ramifications if there is sensitive information there. As such, in this paper, we aim to understand the input copying and regurgitation capabilities of these models during inference and how they can be directly instructed to limit this copying by complying with regulations such as HIPAA and GDPR, based on their internal knowledge of them. More specifically, we find that when ChatGPT is prompted to summarize the cover letters of 100 candidates, it retains personally identifiable information (PII) verbatim in 57.4% of cases, and we find this retention to be non-uniform between different subgroups of people, based on attributes such as gender identity. We then probe ChatGPT's perception of privacy-related policies and privatization mechanisms by directly instructing it to provide compliant outputs and observe a significant omission of PII from the output. |
Voted: Mike, Srividya
#3672
-
Puchert 2023
LLMMaps – A Visual Metaphor for Stratified Evaluation of Large Language Models
arXiv 2023;(): 2023 Ref ID: 7667 Large Language Models (LLMs) have revolutionized natural language processing and demonstrated impressive capabilities in various tasks. Unfortunately, they are prone to hallucinations, where the model exposes incorrect or false information in its responses, which renders diligent evaluation approaches mandatory. While LLM performance in specific knowledge fields is often evaluated based on question and answer (Q&A) datasets, such evaluations usually report only a single accuracy number for the dataset, which often covers an entire field. This field-based evaluation is problematic with respect to transparency and model improvement. A stratified evaluation could instead reveal subfields where hallucinations are more likely to occur and thus help to better assess LLMs' risks and guide their further development. To support such stratified evaluations, we propose LLMMaps as a novel visualization technique that enables users to evaluate LLMs' performance with respect to Q&A datasets. LLMMaps provide detailed insights into LLMs' knowledge capabilities in different subfields, by transforming Q&A datasets as well as LLM responses into an internal knowledge structure. An extension for comparative visualization, furthermore, allows for the detailed comparison of multiple LLMs. To assess LLMMaps, we use them to conduct a comparative analysis of several state-of-the-art LLMs, such as BLOOM, GPT-2, GPT-3, ChatGPT and LLaMa-13B, as well as two qualitative user evaluations. All necessary source code and data for generating LLMMaps to be used in scientific publications and elsewhere is available on GitHub: https://github.com/viscom-ulm/LLMMaps |
Voted: Davis, Srividya
#579
-
Putman 2023
The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species
Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app. |
Voted: mohammed afaan, yuexi
#1986
-
Qi 2023
Traditional Chinese Medicine Prescription Recommendation Model Based on Large Language Models and Graph Neural Networks
Proceedings - 2023 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():4623-4627 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BIBM58861.2023.10385489 · Ref ID: 4970 Background: Traditional Chinese medicine (TCM) has a millennia-long history, offering unique treatments and insights into global health. Given the intricate symptoms and shifting syndrome patterns, prescribing can be tough for young doctors. TCM prescription recommendations can help these doctors address their experience gap. In recent years, with advancements in technologies such as artificial intelligence and big data, intelligent recommendations for TCM prescriptions have become feasible, holding significant implications for enhancing treatment efficacy and optimizing patient experience. Objective: This study aims to establish a novel TCM prescription recommendation model by integrating large language models with Graph Neural Network (GNN) to enhance the accuracy of prescription suggestions. Method: Based on the co-occurrence of symptoms and herbal medicines, we constructed symptom graphs, symptom-herb graphs, and herb-herb graphs. Using Graph Convolutional Network (GCN), we acquired embeddings for both symptoms and herbs. The symptom embeddings are then integrated with insights from large language model embeddings, while auxiliary information from an external knowledge graph is incorporated into the herb embeddings. A final list of herb recommendations was generated by interacting with the embeddings of symptoms and herbs. Results: The proposed algorithm achieved 22.1%, 17.2%, and 13% on the evaluation metrics P@5, P@10, and P@20, respectively. Concurrently, scores for R@5, R@10, and R@20 were 14%, 24%, and 32.5%, respectively. The P@5 metric surpassed the KDHR by 4.7%, and the R@20 metric exceeded the KDHR by 6%. Overall, the performance of our model outperformed other baseline models across various evaluation criteria. 
Conclusion: The TCM prescription recommendation model, infused with information from a large language model, can effectively enhance the outcomes of TCM prescription recommendations. The study may offer valuable insights for auxiliary clinical research and treatment in TCM. © 2023 IEEE. |
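The P@k and R@k figures reported in this abstract follow the standard top-k definitions; a minimal Python sketch, assuming a ranked recommendation list and a ground-truth set (illustrative names, not the paper's code):

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision@k and Recall@k for a ranked recommendation list.
    `recommended` is ranked best-first; `relevant` is the set of
    ground-truth items (e.g. the herbs actually prescribed)."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

So a P@5 of 22.1% means that, on average, about one in five of the top-5 recommended herbs was actually prescribed, while R@20 of 32.5% means the top-20 list covered about a third of the true prescription.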
Voted: mohammed afaan, Ishan
#3854
-
Qi 2024
Safety Control of Service Robots with LLMs and Embodied Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8324 Safety limitations in service robotics across various industries have raised significant concerns about the need for robust mechanisms ensuring that robots adhere to safe practices, thereby preventing actions that might harm humans or cause property damage. Despite advances, including the integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), challenges in ensuring consistent safety in autonomous robot actions persist. In this paper, we propose a novel integration of Large Language Models with Embodied Robotic Control Prompts (ERCPs) and Embodied Knowledge Graphs (EKGs) to enhance the safety framework for service robots. ERCPs are designed as predefined instructions that ensure LLMs generate safe and precise responses. These responses are subsequently validated by EKGs, which provide a comprehensive knowledge base ensuring that the actions of the robot are continuously aligned with safety protocols, thereby promoting safer operational practices in varied contexts. Our experimental setup involved diverse real-world tasks, where robots equipped with our framework demonstrated significantly higher compliance with safety standards compared to traditional methods. This integration fosters secure human-robot interactions and positions our methodology at the forefront of AI-driven safety innovations in service robotics. |
Voted: Davis, mohammed afaan
#3436
-
Qi 2023
FoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph Prompt
arXiv 2023;(): 2023 Ref ID: 7810 Currently, the construction of large language models in specific domains is done by fine-tuning on a base model. Some models also incorporate knowledge bases without the need for pre-training. This is because the base model already contains domain-specific knowledge during the pre-training process. We build a large language model for food testing. Unlike the above approach, a significant amount of data in this domain exists in scanned format for domain standard documents. In addition, there is a large amount of untrained structured knowledge. Therefore, we introduce an incremental pre-training step to inject this knowledge into a large language model. In this paper, we propose a method for handling structured knowledge and scanned documents in incremental pre-training. To overcome the problem of machine hallucination, we construct a knowledge graph to serve as an external knowledge base for supporting retrieval in the large language model. It is worth mentioning that this paper is a technical report of our pre-release version, and we will report our specific experimental data in future versions. |
Voted: Xinchen, Srividya
#3708
-
Qian 2023
"Merge Conflicts!" Exploring the Impacts of External Distractors to Parametric Knowledge Graphs
arXiv 2023;(): 2023 Ref ID: 7834 Large language models (LLMs) acquire extensive knowledge during pre-training, known as their parametric knowledge. However, in order to remain up-to-date and align with human instructions, LLMs inevitably require external knowledge during their interactions with users. This raises a crucial question: How will LLMs respond when external knowledge interferes with their parametric knowledge? To investigate this question, we propose a framework that systematically elicits LLM parametric knowledge and introduces external knowledge. Specifically, we uncover the impacts by constructing a parametric knowledge graph to reveal the different knowledge structures of LLMs, and introduce external knowledge through distractors of varying degrees, methods, positions, and formats. Our experiments on both black-box and open-source models demonstrate that LLMs tend to produce responses that deviate from their parametric knowledge, particularly when they encounter direct conflicts or confounding changes of information within detailed contexts. We also find that while LLMs are sensitive to the veracity of external knowledge, they can still be distracted by unrelated information. These findings highlight the risk of hallucination when integrating external knowledge, even indirectly, during interactions with current LLMs. All the data and results are publicly available. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1267
-
Qian 2023
Enhancing Text Comprehension via Fusing Pre-trained Language Model with Knowledge Graph
ACM International Conference Proceeding Series 2023;():353-360 Association for Computing Machinery 2023 DOI: 10.1145/3639631.3639689 · Ref ID: 4753 Pre-trained language models (PLMs) such as BERT and GPTs capture rich linguistic and syntactic knowledge from pre-training over large-scale text corpora, which can be further fine-tuned for specific downstream tasks. However, these models still have limitations as they rely on knowledge gained from plain text and ignore structured knowledge such as knowledge graphs (KGs). Recently, there has been a growing trend of explicitly integrating KGs into PLMs to improve their performance. For instance, K-BERT incorporates KG triples as domain-specific supplements into input sentences. Nevertheless, we have observed that such methods do not consider the semantic relevance between the introduced knowledge and the original input sentence, leading to the issue of knowledge impurities. To address this issue, we propose a semantic matching-based approach that enriches the input text with knowledge extracted from an external KG. The architecture of our model comprises three components: the knowledge retriever (KR), the knowledge injector (KI), and the knowledge aggregator (KA). The KR, built upon the sentence representation learning model (i.e., CoSENT), retrieves triples with high semantic relevance to the input sentence from an external KG to alleviate the issue of knowledge impurities. The KI then integrates the retrieved triples into the input text by converting the original sentence into a knowledge tree with multiple branches; the knowledge tree is then flattened into an accessible sequence of text that can be fed into the KA. Finally, the KA takes the flattened knowledge tree and passes it through an embedding layer and a masked Transformer encoder. 
We conducted extensive evaluations on eight datasets covering five text comprehension tasks, and the experimental results demonstrate that our approach exhibits competitive advantages over popular knowledge-enhanced PLMs such as K-BERT and ERNIE. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
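The knowledge retriever (KR) in the Qian 2023 (#1267) abstract above keeps only KG triples that are semantically relevant to the input sentence. A minimal sketch of that filtering step, using a toy bag-of-words cosine similarity as a stand-in for the paper's CoSENT sentence embeddings; the function names, example triples, and threshold are illustrative assumptions, not from the paper:

```python
from collections import Counter
from math import sqrt

def bow_vector(text):
    """Toy bag-of-words vector; a stand-in for real sentence embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_triples(sentence, triples, threshold=0.2):
    """Keep triples whose verbalized form is close to the input sentence,
    filtering out 'knowledge impurities' (irrelevant triples)."""
    sv = bow_vector(sentence)
    scored = []
    for head, rel, tail in triples:
        score = cosine(sv, bow_vector(f"{head} {rel} {tail}"))
        if score >= threshold:
            scored.append(((head, rel, tail), score))
    scored.sort(key=lambda pair: -pair[1])
    return [triple for triple, _ in scored]

kg = [
    ("Beijing", "capital of", "China"),
    ("apple", "is a", "fruit"),
]
print(retrieve_triples("Beijing is the capital city of China", kg))
# → [('Beijing', 'capital of', 'China')]
```

The surviving triples would then be grafted onto the sentence as knowledge-tree branches before flattening, per the abstract's KI/KA stages.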
#397
-
Qiao 2022
A joint model for entity and relation extraction based on BERT
In recent years, the knowledge graph has attained significant achievements in many fields and has become one of the core driving forces for the development of the internet and artificial intelligence. However, there is no mature knowledge graph in the field of agriculture, so studying construction techniques for an agricultural knowledge graph is of great significance. Named entity recognition and relation extraction are key steps in knowledge graph construction. In this paper, we introduce the BERT pre-trained language model into the joint extraction model LSTM-LSTM-Bias and propose an agricultural entity-relation joint extraction model, BERT-BILSTM-LSTM, which is applied to the standard dataset NYT and the self-built agricultural dataset AgriRelation. Experimental results show that the model can effectively extract the relationships between agricultural entities. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1626
-
Qiu 2024
Matching Tabular Data to Knowledge Graph with Effective Core Column Set Discovery
Matching tabular data to a knowledge graph (KG) is critical for understanding the semantic column types, column relationships, and entities of a table. Existing matching approaches rely heavily on core columns that represent primary subject entities on which other columns in the table depend. However, discovering these core columns before understanding the table's semantics is challenging. Most prior works use heuristic rules, such as the leftmost column, to discover a single core column, while an insightful discovery of the core column set that accurately captures the dependencies between columns is often overlooked. To address these challenges, we introduce Dependency-aware Core Column Set Discovery (DaCo), an iterative method that uses a novel rough matching strategy to identify both inter-column dependencies and the core column set. Additionally, DaCo can be seamlessly integrated with pre-trained language models, as proposed in the optimization module. Unlike other methods, DaCo does not require labeled data or contextual information, making it suitable for real-world scenarios. In addition, it can identify multiple core columns within a table, which is common in real-world tables. We conduct experiments on six datasets, including five datasets with single core columns and one dataset with multiple core columns. Our experimental results show that DaCo outperforms existing core column set detection methods, further improving the effectiveness of table understanding tasks. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#185
-
Qiu 2023
DOCUMENT UNDERSTANDING-BASED DESIGN SUPPORT: LANGUAGE MODEL BASED DESIGN KNOWLEDGE EXTRACTION
ASME International Design Engineering Technical Conferences / Computers and Information in Engineering Conference (IDETC-CIE) / 49th Design Automation Conference (DAC) 2023;(): Boston, MA Amer Soc Mechanical Engineers 2023 Ref ID: 3799 Design knowledge in the vast amount of design reports and documents can be a great resource for designers in their practice. However, capturing such domain-specific information embedded in long-length unstructured texts is always time-consuming and sometimes difficult. Therefore, it is highly desirable for a computer system to automatically extract the main knowledge points and their corresponding inner structures from given documents. In this study of document understanding for design support (DocUDS), a design-perspective knowledge extraction approach is proposed that uses phrase-level domain-specific labeled datasets to finetune a Bidirectional Encoder Representation from Transformers (BERT) model so that it can extract design knowledge from documents. The BERT model finetuning attempts to blend in the domain-specific knowledge of well-recognized domain concepts and is based on the datasets generated from design reports. The model is utilized to map the captured sentences to the main design entities <requirement>, <function>, and <solution>. In addition, this approach uncovers inner relationships among the sentences and constructs overall structures of documents to enhance understanding. The definitions of design perspectives, inter-perspective relations, and intra-perspective relations are introduced, which together capture the main design knowledge points and their relations and constitute an understanding of the design domain knowledge of a text. The case study results have demonstrated the proposed approach's effectiveness in understanding and extracting relevant design knowledge points. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#755
-
Quiroz-Mercado 2020
Semantic Similarity Estimation Using Vector Symbolic Architectures
For many natural language processing applications, estimating similarity and relatedness between words are key tasks that serve as the basis for classification and generalization. Currently, vector semantic models (VSM) have become a fundamental language modeling tool. VSMs represent words as points in a high-dimensional space and follow the distributional hypothesis of meaning, which assumes that semantic similarity is related to the context. In this paper, we propose a model whose representations are based on the semantic features associated with a concept within the ConceptNet knowledge graph. The proposed model is based on a vector symbolic architecture framework, which defines a set of arithmetic operations to encode the semantic features within a single high-dimensional vector. In addition to word distribution, these vector representations consider several types of information. Moreover, owing to the properties of high-dimensional spaces, they have the additional advantage of being interpretable. We analyze the model's performance on the SimLex-999 dataset, a dataset where commonly used distributional models (e.g., word2vec or GloVe) perform poorly. Our results are similar to those of other hybrid models, and they surpass several state-of-the-art distributional and knowledge-based models. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
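The Quiroz-Mercado 2020 abstract above encodes semantic features with arithmetic operations on high-dimensional vectors. A minimal sketch of the two core vector-symbolic operations on bipolar hypervectors (binding, which yields a vector dissimilar to its inputs, and bundling, which stays similar to both); this is a generic VSA illustration, not the paper's actual ConceptNet-based model:

```python
import random

random.seed(0)
DIM = 10_000  # high dimension makes independent random vectors quasi-orthogonal

def rand_hv():
    """Random bipolar hypervector in {-1, +1}^DIM."""
    return [random.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Binding (elementwise product): the result is dissimilar to both inputs."""
    return [x * y for x, y in zip(a, b)]

def bundle(a, b):
    """Bundling (elementwise majority, random tie-break): similar to both inputs."""
    return [1 if x + y > 0 else -1 if x + y < 0 else random.choice((-1, 1))
            for x, y in zip(a, b)]

def cosine(a, b):
    # Every bipolar vector has norm sqrt(DIM), so cosine is just dot / DIM.
    return sum(x * y for x, y in zip(a, b)) / DIM

concept, feature = rand_hv(), rand_hv()
print(round(cosine(bind(concept, feature), concept), 2))    # ~0.0: binding hides inputs
print(round(cosine(bundle(concept, feature), concept), 2))  # ~0.5: bundling preserves them
```

Because binding is invertible (`bind(bind(a, b), b) == a` for bipolar vectors), role-filler pairs can be packed into one vector and queried back out, which is what makes these representations interpretable.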
#3431
-
Rabby 2024
Fine-tuning and Prompt Engineering with Cognitive Knowledge Graphs for Scholarly Knowledge Organization
arXiv 2024;(): 2024 Ref ID: 8587 The increasing number of published scholarly articles, exceeding 2.5 million yearly, makes it challenging for researchers to follow scientific progress. Integrating the contributions from scholarly articles into a novel type of cognitive knowledge graph (CKG) will be a crucial element for accessing and organizing scholarly knowledge, surpassing the insights provided by titles and abstracts. This research focuses on effectively conveying structured scholarly knowledge by utilizing large language models (LLMs) to categorize scholarly articles and describe their contributions in a structured and comparable manner. While previous studies explored language models within specific research domains, the extensive domain-independent knowledge captured by LLMs offers a substantial opportunity for generating structured contribution descriptions as CKGs. Additionally, LLMs offer customizable pathways through prompt engineering or fine-tuning, thus facilitating the use of smaller LLMs known for their efficiency, cost-effectiveness, and environmental considerations. Our methodology involves harnessing LLM knowledge and complementing it with domain expert-verified scholarly data sourced from a CKG. This strategic fusion significantly enhances LLM performance, especially in tasks like scholarly article categorization and predicate recommendation. Our method involves fine-tuning LLMs with CKG knowledge and additionally injecting knowledge from a CKG with a novel prompting technique, significantly increasing the accuracy of scholarly knowledge extraction. We integrated our approach into the Open Research Knowledge Graph (ORKG), thus enabling precise access to organized scholarly knowledge, crucially benefiting domain-independent scholarly knowledge exchange and dissemination among policymakers, industrial practitioners, and the general public. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
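The Rabby 2024 abstract above injects expert-verified CKG knowledge into prompts before asking the model to categorize an article. A minimal sketch of such a knowledge-injecting prompt template; the field names, task wording, and example facts are illustrative assumptions, not the paper's actual prompts:

```python
def build_prompt(title, abstract, ckg_facts):
    """Fill a grounding template: verified CKG facts first, then the article,
    then the categorization task. All wording here is illustrative only."""
    facts = "\n".join(f"- {s} {p} {o}." for s, p, o in ckg_facts)
    return (
        "Verified background knowledge:\n"
        f"{facts}\n\n"
        f"Article title: {title}\n"
        f"Abstract: {abstract}\n\n"
        "Task: assign the article to one research field and describe its "
        "main contribution as (predicate, value) pairs."
    )

prompt = build_prompt(
    "A study of graph neural networks",
    "We benchmark GNN architectures on citation graphs ...",
    [("graph neural network", "is a", "machine learning model")],
)
print(prompt)
```

Putting the curated facts before the task instruction is the same grounding idea the head record (Abu-Rasheed 2024) applies to learning recommendations: the model completes a template rather than free-generating.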
#3274
-
Radha 2024
Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners
arXiv 2024;(): 2024 Ref ID: 8685 Human learning thrives on the ability to learn from mistakes, adapt through feedback, and refine understanding-processes often missing in static machine learning models. In this work, we introduce Composite Learning Units (CLUs) designed to transform reasoners, such as Large Language Models (LLMs), into learners capable of generalized, continuous learning without conventional parameter updates while enhancing their reasoning abilities through continual interaction and feedback. CLUs are built on an architecture that allows a reasoning model to maintain and evolve a dynamic knowledge repository: a General Knowledge Space for broad, reusable insights and a Prompt-Specific Knowledge Space for task-specific learning. Through goal-driven interactions, CLUs iteratively refine these knowledge spaces, enabling the system to adapt dynamically to complex tasks, extract nuanced insights, and build upon past experiences autonomously. We demonstrate CLUs' effectiveness through a cryptographic reasoning task, where they continuously evolve their understanding through feedback to uncover hidden transformation rules. While conventional models struggle to grasp underlying logic, CLUs excel by engaging in an iterative, goal-oriented process. Specialized components-handling knowledge retrieval, prompt generation, and feedback analysis-work together within a reinforcing feedback loop. This approach allows CLUs to retain the memory of past failures and successes, adapt autonomously, and apply sophisticated reasoning effectively, continually learning from mistakes while also building on breakthroughs. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1026
-
Rajpal 2023
BERTologyNavigator: Advanced Question Answering with BERT-based Semantics
CEUR Workshop Proceedings 2023;3592(): CEUR-WS 2023 Ref ID: 5094 The development and integration of knowledge graphs and language models has significance in artificial intelligence and natural language processing. In this study, we introduce BERTologyNavigator, a two-phase system that combines relation extraction techniques and BERT embeddings to navigate the relationships within the DBLP Knowledge Graph (KG). Our approach focuses on extracting one-hop relations and labelled candidate pairs in the first phase, followed by employing BERT's CLS embeddings and additional heuristics for relation selection in the second phase. Our system reaches an F1 score of 0.2175 on the DBLP QuAD Final test dataset for Scholarly QALD, and an F1 score of 0.98 on the subset of the DBLP QuAD test dataset during the QA phase. © 2023 CEUR-WS. All rights reserved. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#42
-
Ramchand 2024
Augmenting Infrequent Relationships in Clinical Language Models with Graph-Encoded Hierarchical Ontologies
1st International Conference on Artificial Intelligence in Healthcare (AIiH) 2024;14975():31-44 Swansea, ENGLAND Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-67278-1_3 · Ref ID: 3195 Harnessing primary-care data can facilitate earlier clinical interventions via predictive modelling. Nonetheless, the intricacy of medical terminology and the breadth of ontological data often obscure the inner workings of such models. Despite the growing complexity of artificial intelligence methodologies and the pressing demand for medical tools that seamlessly integrate into clinical workflows, this opacity persists. We propose enhancing clinical Bidirectional Encoder Representations from Transformers (BERT) models with graph attention networks that encode diagnosis and medication concept hierarchies derived from primary care data. In 10-fold cross-validation on cardiovascular and respiratory detection tasks, our graph-enhanced model marginally improves F1 performance over baseline BERT. More importantly, our approach surfaces clinically deterministic patterns in patient groups, provides modular visualisations of influential terminal and ancestral medical concepts, and improves clustering of related conditions. Additionally, the hierarchical encoding allows quantitative analysis of edge relevance within and across diagnosis and medical ontologies. Our research shows that injecting structured knowledge graphs into language model architectures can improve performance through domain-specific regularisation. Additionally, the use of class activation maps throughout the approach allows for richer interpretations of predictions by following activation flows along concept relationships. The dual utility of precise ontology encoding and Large Language Models makes our graph-injected clinical language model more accurate and trustworthy, propelling preventive precision medicine forward. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3878
-
Rangel 2024
SPARQL Generation: an analysis on fine-tuning OpenLLaMA for Question Answering over a Life Science Knowledge Graph
arXiv 2024;(): 2024 Ref ID: 8079 The recent success of Large Language Models (LLM) in a wide range of Natural Language Processing applications opens the path towards novel Question Answering Systems over Knowledge Graphs leveraging LLMs. However, one of the main obstacles preventing their implementation is the scarcity of training data for the task of translating questions into corresponding SPARQL queries, particularly in the case of domain-specific KGs. To overcome this challenge, in this study, we evaluate several strategies for fine-tuning the OpenLlama LLM for question answering over life science knowledge graphs. In particular, we propose an end-to-end data augmentation approach for extending a set of existing queries over a given knowledge graph towards a larger dataset of semantically enriched question-to-SPARQL query pairs, enabling fine-tuning even for datasets where these pairs are scarce. In this context, we also investigate the role of semantic "clues" in the queries, such as meaningful variable names and inline comments. Finally, we evaluate our approach over the real-world Bgee gene expression knowledge graph and we show that semantic clues can improve model performance by up to 33% compared to a baseline with random variable names and no comments included. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#52
-
Rawsthorne 2023
Automatic Nested Spatial Entity and Spatial Relation Extraction From Text for Knowledge Graph Creation: A Baseline Approach and a Benchmark Dataset
7th ACM SIGSPATIAL International Workshop on Geospatial Humanities (GeoHumanities) 2023;():21-30 Hamburg, GERMANY Assoc Computing Machinery 2023 DOI: 10.1145/3615887.3627754 · Ref ID: 3264 Automatically extracting geographic information from text is the key to harnessing the vast amount of spatial knowledge that only exists in this unstructured form. The fundamental elements of spatial knowledge include spatial entities, their types and the spatial relations between them. Structuring the spatial knowledge contained within text as a geospatial knowledge graph, and disambiguating the spatial entities, significantly facilitates its reuse. The automatic extraction of geographic information from text also allows the creation or enrichment of gazetteers. We propose a baseline approach for nested spatial entity and binary spatial relation extraction from text, a new annotated French-language benchmark dataset on the maritime domain that can be used to train algorithms for both extraction tasks, and benchmark results for the two tasks carried out individually and end-to-end. Our approach involves applying the Princeton University Relation Extraction system (PURE), made for flat, generic entity extraction and generic binary relation extraction, to the extraction of nested, spatial entities and spatial binary relations. By extracting nested spatial entities and the spatial relations between them, we have more information to aid entity disambiguation. In our experiments we compare the performance of a pretrained monolingual French BERT language model with that of a pretrained multilingual BERT language model, and study the effect of including cross-sentence context. Our results reveal very similar results for both models, although the multilingual model performs slightly better in entity extraction, and the monolingual model has slightly better relation extraction and end-to-end performances. 
We observe that increasing the amount of cross-sentence context improves the results for entity extraction whereas it has the opposite effect on relation extraction. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1997
-
Rawte 2024
Tutorial Proposal: Hallucination in Large Language Models
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Tutorial Summaries 2024;():68-72 European Language Resources Association (ELRA) 2024 Ref ID: 4651 In the fast-paced domain of Large Language Models (LLMs), the issue of hallucination is a prominent challenge. Despite continuous endeavors to address this concern, it remains a highly active area of research within the LLM landscape. Grasping the intricacies of this problem can be daunting, especially for those new to the field. This tutorial aims to bridge this knowledge gap by introducing the emerging realm of hallucination in LLMs. It will comprehensively explore the key aspects of hallucination, including benchmarking, detection, and mitigation techniques. Furthermore, we will delve into the specific constraints and shortcomings of current approaches, providing valuable insights to guide future research efforts for participants. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Mike
voted
Kwesi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#354
-
Razouk 2023
Improving FMEA Comprehensibility via Common-Sense Knowledge Graph Completion Techniques
The Failure Mode Effect Analysis (FMEA) process is widely used in industry for risk assessment, as it effectively captures and documents domain-specific knowledge. This process is mainly concerned with causal domain knowledge. In practical applications, FMEAs encounter challenges in terms of comprehensibility, particularly inadequate coverage of the listed failure modes and their corresponding effects and causes. This can be attributed to the limitations of the traditional brainstorming approaches typically employed in the FMEA process: depending on the size and disciplinary diversity of the team conducting the analysis, these approaches may not capture a comprehensive range of failure modes, leading to gaps in coverage. Methods for improving FMEA knowledge comprehensibility are therefore highly needed. A potential approach to address this gap is rooted in recent advances in common-sense knowledge graph completion, which have demonstrated the effectiveness of text-aware graph embedding techniques. However, the applicability of such methods in an industrial setting is limited. This paper addresses this issue for FMEA documents in an industrial environment, studying the application of common-sense knowledge graph completion methods on FMEA documents from semiconductor manufacturing. These methods achieve over 20% MRR on the test set, and 70% of the top-10 predictions were manually assessed as plausible by domain experts. Based on the evaluation, this paper confirms that text-aware knowledge graph embeddings are more effective than structure-only knowledge graph embeddings for improving FMEA knowledge comprehensibility via common-sense knowledge graph completion. Additionally, we found that in-domain fine-tuning of the language model is beneficial for extracting more meaningful embeddings, thus improving overall model performance. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1347
-
Regino 2024
Generating E-commerce Related Knowledge Graph from Text: Open Challenges and Early Results using LLMs
CEUR Workshop Proceedings 2024;3747():18 CEUR-WS 2024 Ref ID: 4360 E-commerce systems need to use and manage vast amounts of unstructured textual data. This poses significant challenges for knowledge representation, information retrieval, and recommendation tasks. This study investigates the generation of E-commerce-related Knowledge Graphs (KGs) from text. In particular, we explore using Large Language Models (LLMs). Our approach integrates ontology with text-based examples from existing KGs via prompts to create structured RDF triples. We outline a four-step method encompassing text classification, extracting relevant characteristics, generating RDF triples, and assessing the generated triples. Each step leverages LLM instructions to process unstructured text. We discuss the insights, challenges, and potential future directions, highlighting the significance of integrating ontology elements with unstructured text for generating semantically enriched KGs. Through case experimentations, we demonstrate the effectiveness and applicability of our solution in the E-commerce domain. © 2024 Copyright for this paper by its authors. |
Ishan
voted
brandon
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
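The Regino 2024 abstract above generates RDF triples from characteristics extracted out of unstructured product text. A minimal sketch of the final serialization step, emitting N-Triples lines from extracted (property, value) pairs; the IRIs and property names are hypothetical placeholders, not the paper's ontology:

```python
def to_ntriples(subject_iri, properties, base="http://example.org/vocab#"):
    """Serialize extracted (property, value) pairs as N-Triples lines.
    IRIs and property names here are hypothetical placeholders."""
    lines = []
    for prop, value in properties:
        escaped = value.replace('"', '\\"')  # minimal literal escaping
        lines.append(f'<{subject_iri}> <{base}{prop}> "{escaped}" .')
    return "\n".join(lines)

triples = to_ntriples(
    "http://example.org/product/123",
    [("brand", "Acme"), ("color", "red")],
)
print(triples)
```

In the paper's four-step pipeline, an LLM produces the classification and the extracted characteristics; a deterministic serializer like this keeps the final RDF syntactically valid regardless of model output quirks.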
#2800
-
Rehmat 2020
Predicting the pathogenicity of protein coding mutations using Natural Language Processing
2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 2020;():5842-5846 2020 DOI: 10.1109/EMBC44109.2020.9175781 · Ref ID: 6681 DNA-Sequencing of tumor cells has revealed thousands of genetic mutations. However, cancer is caused by only some of them. Distinguishing mutations that contribute to tumor growth from neutral ones is extremely challenging and is currently carried out manually. This manual annotation is very cumbersome and expensive in terms of time and money. In this study, we introduce a novel method, "NLP-SNPPred", that reads scientific literature and learns the implicit features that cause certain variations to be pathogenic. Specifically, our method ingests the bio-medical literature and produces its vector representation via state-of-the-art NLP methods such as sent2vec, word2vec and tf-idf. These representations are then fed to machine learning predictors to separate pathogenic from neutral variations. Our best model (NLP-SNPPred), trained on OncoKB and evaluated on several publicly available benchmark datasets, outperformed state-of-the-art function prediction methods. Our results show that NLP can be used effectively in predicting the functional impact of protein coding variations with minimal complementary biological features. Moreover, encoding biological knowledge into the right representations, combined with machine learning methods, can help automate manual efforts. A free-to-use web-server is available at http://www.nlp-snppred.cbrlab.org. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#1391
-
Ren 2024
Identifying Semantic Induction Heads to Understand In-Context Learning
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():6916-6932 Association for Computational Linguistics (ACL) 2024 Ref ID: 4530 Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs. © 2024 Association for Computational Linguistics. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#2121
-
Ren 2020
API-Misuse Detection Driven by Fine-Grained API-Constraint Knowledge Graph
2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2020;():461-472 2020 Ref ID: 6115 API misuses cause significant problems in software development. Existing methods detect API misuses against frequent API usage patterns mined from codebases, under the naive assumption that API usage deviating from the most-frequent usage is a misuse. However, there is a big knowledge gap between API usage patterns and API usage caveats in terms of comprehensiveness, explainability and best practices. In this work, we propose a novel approach that detects API misuses directly against API caveat knowledge, rather than API usage patterns. We develop open information extraction methods to construct a novel API-constraint knowledge graph from API reference documentation. This knowledge graph explicitly models two types of API-constraint relations (call-order and condition-checking) and enriches return and throw relations with return conditions and exception triggers. It empowers the detection of three types of frequent API misuses - missing calls, missing condition checking and missing exception handling - while existing detectors mostly focus only on missing calls. As a proof of concept, we apply our approach to the Java SDK API Specification. Our evaluation confirms the high accuracy of the extracted API-constraint relations. Our knowledge-driven API misuse detector achieves 0.60 (68/113) precision and 0.28 (68/239) recall for detecting Java API misuses in the API misuse benchmark MuBench, significantly higher than existing pattern-based API misuse detectors. A pilot user study with 12 developers shows that our knowledge-driven API misuse detection is very promising in helping developers avoid API misuses and debug the bugs they cause. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#3558
-
Ren 2023
Joint Semantic and Structural Representation Learning for Enhancing User Preference Modelling
arXiv 2023;(): 2023 Ref ID: 7682 Knowledge graphs (KGs) have become important auxiliary information for helping recommender systems obtain a good understanding of user preferences. Despite recent advances in KG-based recommender systems, existing methods are prone to suboptimal performance due to the following two drawbacks: 1) current KG-based methods over-emphasize the heterogeneous structural information within a KG and overlook the underlying semantics of its connections, hindering the recommender from distilling the explicit user preferences; and 2) the inherent incompleteness of a KG (i.e., missing facts, relations and entities) will deteriorate the information extracted from KG and weaken the representation learning of recommender systems. To tackle the aforementioned problems, we investigate the potential of jointly incorporating the structural and semantic information within a KG to model user preferences in finer granularity. A new framework for KG-based recommender systems, namely the Knowledge Infomax Recommender System with Contrastive Learning (KIRS-CL), is proposed in this paper. Distinct from previous KG-based approaches, KIRS-CL utilizes structural and connectivity information with high-quality item embeddings learned by encoding KG triples with a pre-trained language model. These well-trained entity representations enable KIRS-CL to find the item to recommend via the preference connection between the user and the item. Additionally, to improve the generalizability of our framework, we introduce a contrastive warm-up learning strategy, making it capable of dealing with both warm- and cold-start recommendation scenarios. Extensive experiments on two real-world datasets demonstrate remarkable improvements over state-of-the-art baselines. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
|
|||
|
|
||||
|
#272
-
Ren 2021
Fake News Detection on News-Oriented Heterogeneous Information Networks through Hierarchical Graph Attention
International Joint Conference on Neural Networks (IJCNN) 2021;(): Electr Network Ieee 2021 DOI: 10.1109/ijcnn52387.2021.9534362 · Ref ID: 3690 The viral spread of fake news has caused great social harm, making fake news detection an urgent task. Current fake news detection methods rely heavily on text information, learning from the extracted news content or writing style as internal knowledge. However, deliberate rumors can mask writing style, bypassing language models and invalidating simple text-based models. In fact, news articles and other related components (such as news creators and news topics) can be modeled as a heterogeneous information network (HIN for short). In this paper, we propose a novel fake news detection framework, the Hierarchical Graph Attention Network (HGAT), which uses a novel hierarchical attention mechanism to perform node representation learning in a HIN, and then detects fake news by classifying news article nodes. Experiments on two real-world fake news datasets show that HGAT can outperform text-based models and other network-based models. In addition, the experiments demonstrate the expandability and generalizability of our model for graph representation learning and other node classification applications in heterogeneous graphs. |
Voted: brandon, Kwesi
#1969
-
Ren 2023
Towards Informative Open-ended Text Generation with Dynamic Knowledge Triples
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():3189-3203 Association for Computational Linguistics (ACL) 2023 Ref ID: 5069 Pretrained language models (PLMs), especially large language models (LLMs), demonstrate impressive capabilities in open-ended text generation. However, our statistical results show that LLMs often suffer from over-concentrated information, where the generated texts overly focus on the given prompt and fail to provide sufficient background and detailed information as humans do. To address this issue, we propose a dynamic knowledge-guided informative open-ended text generation approach that utilizes a knowledge graph to help the model generate more contextually related entities and detailed facts. Specifically, we first employ a local knowledge filter to extract relevant knowledge from the comprehensive knowledge graph for a given topic sentence. Then we introduce a dynamic knowledge selector to predict the entity to be mentioned in the subsequent sentence. Finally, we utilize a knowledge-enhanced text generator to produce a more informative output. To evaluate the effectiveness of our approach, we evaluate the proposed approach in two scenarios: fine-tuning for small PLMs and prompt tuning for LLMs. Experimental results show that our approach could generate more informative texts than baselines. © 2023 Association for Computational Linguistics. |
Voted: Mike, Srividya
#1275
-
Riaz 2023
Entity Typing with Triples Using Language Models
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13998 LNCS():169-173 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-43458-7_32 · Ref ID: 5143 Entity Typing is the task of assigning a type to an entity in a knowledge graph. In this paper, we propose ETwT (Entity Typing with Triples), which leverages the triples of an entity, namely its label, description and the property labels used on it. We analyse which language models and classifiers are best suited to this input and compare ETwT’s performance on coarse-grained and fine-grained entity typing. Our evaluation demonstrates that ETwT is able to predict coarse-grained entity types with an F1 score of 0.994, outperforming three baselines. © The Author(s), under exclusive license to Springer Nature Switzerland AG. 2023. |
Voted: mohammed afaan, Ishan
#2027
-
Ringwald 2024
Well-Written Knowledge Graphs: Most Effective RDF Syntaxes for Triple Linearization in End-to-End Extraction of Relations from Texts
Proceedings of the AAAI Conference on Artificial Intelligence 2024;38():23631-23632 Association for the Advancement of Artificial Intelligence 2024 DOI: 10.1609/aaai.v38i21.30502 · Ref ID: 4049 Seq-to-seq generative models recently gained attention for solving the relation extraction task. By approaching this problem as an end-to-end task, they surpassed encoder-based-only models. Little research investigated the effects of the output syntaxes on the training process of these models. Moreover, a limited number of approaches were proposed for extracting ready-to-load knowledge graphs following the RDF standard. In this paper, we consider that a set of triples can be linearized in many different ways, and we evaluate the combined effect of the size of the language models and different RDF syntaxes on the task of relation extraction from Wikipedia abstracts. Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. |
Voted: Mike, Srividya
#881
-
Ringwald 2024
Well-Written Knowledge Graphs: Most Effective RDF Syntaxes for Triple Linearization in End-to-End Extraction of Relations from Texts (Student Abstract)
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():23631-23632 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3130 Seq-to-seq generative models recently gained attention for solving the relation extraction task. By approaching this problem as an end-to-end task, they surpassed encoder-based-only models. Little research investigated the effects of the output syntaxes on the training process of these models. Moreover, a limited number of approaches were proposed for extracting ready-to-load knowledge graphs following the RDF standard. In this paper, we consider that a set of triples can be linearized in many different ways, and we evaluate the combined effect of the size of the language models and different RDF syntaxes on the task of relation extraction from Wikipedia abstracts. |
Voted: Mike, Srividya
#703
-
Ristoski 2016
RDF2Vec: RDF Graph Embeddings for Data Mining
15th International Semantic Web Conference (ISWC) 2016;9981():498-514 Kobe, JAPAN Springer International Publishing Ag 2016 DOI: 10.1007/978-3-319-46523-4_30 · Ref ID: 3408 Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs. We generate sequences by leveraging local information from graph substructures, harvested by Weisfeiler-Lehman Subtree RDF Graph Kernels and graph walks, and learn latent numerical representations of entities in RDF graphs. Our evaluation shows that such vector representations outperform existing techniques for the propositionalization of RDF graphs on a variety of different predictive machine learning tasks, and that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be easily reused for different tasks. |
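The sequence-generation step that RDF2Vec feeds into a word2vec-style model can be illustrated with a short sketch. This is a minimal, illustrative reconstruction, not the authors' implementation: the toy graph, entity names, and walk parameters are assumptions, and the skip-gram training step is only indicated in a comment.

```python
import random

# Toy RDF-style graph: subject -> list of (predicate, object) edges.
# Entity and property names are illustrative, not from the paper.
graph = {
    "dbr:Berlin": [("dbo:country", "dbr:Germany")],
    "dbr:Germany": [("dbo:capital", "dbr:Berlin"), ("dbo:currency", "dbr:Euro")],
    "dbr:Euro": [],
}

def random_walks(graph, depth=4, walks_per_entity=2, seed=0):
    """Generate token sequences by random graph walks, in the spirit of
    RDF2Vec's sequence-generation stage."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_entity):
            walk, node = [start], start
            for _ in range(depth):
                edges = graph.get(node, [])
                if not edges:
                    break  # dead end: stop this walk early
                pred, obj = rng.choice(edges)
                walk.extend([pred, obj])
                node = obj
            walks.append(walk)
    return walks

walks = random_walks(graph)
# Each walk is a token sequence such as
# ['dbr:Berlin', 'dbo:country', 'dbr:Germany', 'dbo:capital', ...];
# in RDF2Vec these sequences are then fed to a word2vec skip-gram model
# to learn latent vectors for entities and properties.
```

In the full approach these walks (or Weisfeiler-Lehman subtree features) replace the sentences word2vec normally trains on, so every entity ends up with a dense vector usable as propositional features.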
Voted: Davis, mohammed afaan
#3621
-
Robert 2021
Language Models as a Knowledge Source for Cognitive Agents
arXiv 2021;(): 2021 Ref ID: 7482 Language models (LMs) are sentence-completion engines trained on massive corpora. LMs have emerged as a significant breakthrough in natural-language processing, providing capabilities that go far beyond sentence completion including question answering, summarization, and natural-language inference. While many of these capabilities have potential application to cognitive systems, exploiting language models as a source of task knowledge, especially for task learning, offers significant, near-term benefits. We introduce language models and the various tasks to which they have been applied and then review methods of knowledge extraction from language models. The resulting analysis outlines both the challenges and opportunities for using language models as a new knowledge source for cognitive systems. It also identifies possible ways to improve knowledge extraction from language models using the capabilities provided by cognitive systems. Central to success will be the ability of a cognitive agent to itself learn an abstract model of the knowledge implicit in the LM as well as methods to extract high-quality knowledge effectively and efficiently. To illustrate, we introduce a hypothetical robot agent and describe how language models could extend its task knowledge and improve its performance and the kinds of knowledge and methods the agent can use to exploit the knowledge within a language model. |
Voted: Srividya, Mike
#1441
-
Rockstroh 2023
A is the B of C: (Semi)-Automatic Creation of Vossian Antonomasias
CEUR Workshop Proceedings 2023;3640(): CEUR-WS 2023 Ref ID: 4968 A Vossian Antonomasia (VA) is a stylistic device used to describe a person (or, more generally, an entity) in terms of a well-known person and a modifying context. For instance, the Norwegian chess world champion Magnus Carlsen was described as "the Mozart of chess" [1]. All VAs follow the pattern where a source (e.g., "Mozart") is used to describe a target (e.g., "Magnus Carlsen"), and the transfer of meaning is "channeled" through the use of the modifier "of chess". Although this rhetorical figure is well-known, there has not yet been a dedicated study of targeted automatic or semi-automatic methods to generate and judge the appropriateness of VAs using large Knowledge Graphs (KGs) such as Wikidata. In our work, we propose the use of vector space embeddings - both KG-based and text-based - for producing VAs. For comparison, we contrast our findings with a purely LLM-based approach, wherein VAs are obtained from ChatGPT using a reasonably engineered prompt. We provide a publicly available GitHub repository for the implementation of our method and a website that allows testing the proposed methods. © 2023 Copyright for this paper by its authors. |
Voted: mohammed afaan, yuexi
#2457
-
Romeikat 2011
Formal Specification of Domain-Specific ECA Policy Models
2011 Fifth International Conference on Theoretical Aspects of Software Engineering 2011;():209-212 2011 DOI: 10.1109/TASE.2011.29 · Ref ID: 6234 Policy-based management allows to adapt systems to changed requirements in a flexible and automated way. Policy development usually starts with the specification of high-level policies, which are then refined into a low-level representation. We use models to specify event-condition-action (ECA) policies at different levels of abstraction and consequently separate domain and policy aspects from each other. Domain-specific concepts are used within policies in their event, condition, and action parts. We present a formal specification of the models by means of a relational algebra. The algebra is used to validate the models at each level. Finally, executable policy code is generated from the low-level models. |
Voted: mohammed afaan, yuexi
#847
-
Rony 2022
Tree-KGQA: An Unsupervised Approach for Question Answering Over Knowledge Graphs
Most Knowledge Graph-based Question Answering (KGQA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, in this paper, we propose Tree-KGQA, an unsupervised KGQA system leveraging pre-trained language models and tree-based algorithms. Entity and relation linking are essential components of any KGQA system. We employ several pre-trained language models in the entity linking task to recognize the entities mentioned in the question and obtain the contextual representation for indexing. Furthermore, for relation linking we incorporate a pre-trained language model previously trained for language inference task. Finally, we introduce a novel algorithm for extracting the answer entities from a KG, where we construct a forest of interpretations and introduce tree-walking and tree disambiguation techniques. Our algorithm uses the linked relation and predicts the tree branches that eventually lead to the potential answer entities. The proposed method achieves 4.5% and 7.1% gains in F1 score in entity linking tasks on LC-QuAD 2.0 and LC-QuAD 2.0 (KBpearl) datasets, respectively, and a 5.4% increase in the relation linking task on LC-QuAD 2.0 (KBpearl). The comprehensive evaluations demonstrate that our unsupervised KGQA approach outperforms other supervised state-of-the-art methods on the WebQSP-WD test set (1.4% increase in F1 score) - without training on the target dataset. |
Voted: mohammed afaan, Ishan
#1784
-
Rosati 2016
RDF graph embeddings for content-based recommender systems
CEUR Workshop Proceedings 2016;1673():23-30 CEUR-WS 2016 Ref ID: 5809 Linked Open Data has been recognized as a useful source of background knowledge for building content-based recommender systems. Vast amount of RDF data, covering multiple domains, has been published in freely accessible datasets. In this paper, we present an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs used for building content-based recommender system. We generate sequences by leveraging local information from graph sub-structures and learn latent numerical representations of entities in RDF graphs. Our evaluation on two datasets in the domain of movies and books shows that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be effectively used in content-based recommender systems. |
Voted: Davis, mohammed afaan
#2975
-
Roslovtsev 2013
A synthetic approach to building a canonical model of subject areas in the integration bus
2013 3rd International Symposium ISKO-Maghreb 2013;():1-7 2013 DOI: 10.1109/ISKO-Maghreb.2013.6728118 · Ref ID: 6449 This paper is dedicated to the implementation considerations of a canonical model of subject areas in the integration bus and to the definition of data mapping corresponding to this model. The proposed approach to transforming data, when transferring them between individual applications or services of the system, is to convert input messages into output messages using an intermediate canonical representation of the data via rules that map the source model to the canonical one, and the canonical model to the target. Since the canonical model of a subject area serves an important technical task and is not intended for human use, it may be generated automatically instead of being manually designed 'from scratch', as a 'union' of models used in the various application parts of the system. Subject area models of the application parts being integrated may be written using different formalisms, and yet another formalism may be used for the canonical model, so that a mechanism is required to automatically capture various concepts expressed in various formal systems. In the present paper we focus on developing such a mechanism, based on the automatic generation of a (somewhat simplified) representation of the most important kinds of entities in the canonical model. |
Voted: Mike, mohammed afaan
#2482
-
Rožanc 2013
Framework for web application domain knowledge extraction
2013 36th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) 2013;():705-710 2013 Ref ID: 6182 A decade ago a web application e-Student was built with aim to provide electronic support for student enrolment and examination/alumni records management at the University of Ljubljana. Due to issues emerging from the Bologna reform a new e-Student is to be build using a modern technology in the near future. The old e-Student encapsulates a huge amount of domain knowledge. Unfortunately, it was developed using agile approach resulting in poor technical documentation, thus an alternative approach for the domain knowledge extraction has to be defined. In the paper a framework for an effective web application domain knowledge extraction is defined. It has five elements. The main principles (1) of extraction are defined to perform effective reengineering of different application views at a defined abstract level. A proper knowledge representation using diverse models (2) has to be determined next, and the Model Driven Architecture using UML models is considered a suitable choice. The procedure (3) for extraction has to be defined using appropriate (usually custom made) tools (4) and performed by skilled staff (5), possibly members of the old development team. The use of framework is demonstrated on the web application e-Student outlining several custom made tools, the results and the most valuable lessons learnt. |
Voted: mohammed afaan, yuexi
#2198
-
Rybiński 2022
Beyond Low-Code Development: Marrying Requirements Models and Knowledge Representations
2022 17th Conference on Computer Science and Intelligence Systems (FedCSIS) 2022;():919-928 2022 DOI: 10.15439/2022F129 · Ref ID: 6276 Typical Low-Code Development platforms enable model-driven generation of web applications from high-level visual notations. They normally express the UI and the application logic, which allows generating the frontend and basic CRUD operations. However, more complex domain logic (data processing) operations still necessitate the use of traditional programming. This paper presents a visual language, called RSL-DL, to represent domain knowledge with complex domain rules aligned with requirements models. The language synthesises and extends approaches found in knowledge representation (ontologies) and software modelling language engineering. Its purpose is to enable a fully automatic generation of domain logic code by reasoning over and reusing domain knowledge. The language’s abstract syntax is defined using a meta-model expressed in MOF. Its semantics is expressed with several translational rules that map RSL-DL models onto typical programming language constructs. The rules are explained informally in natural language and formalised using a graphical transformation notation. It is also supported by introducing an inference engine that enables processing queries to domain models and selecting appropriate invocations to generated code. The presented language was implemented by building a dedicated model editor and transformation engine. It was also initially validated through usability studies. Based on these results, we conclude that declarative knowledge representations can be successfully used to produce imperative back-end code with non-trivial logic. |
Voted: mohammed afaan, yuexi
#3287
-
Saberi 2024
Context-Augmented Code Generation Using Programming Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8747 Large Language Models (LLMs) and Code-LLMs (CLLMs) have significantly improved code generation, but they frequently face difficulties when dealing with challenging and complex problems. Retrieval-Augmented Generation (RAG) addresses this issue by retrieving and integrating external knowledge at inference time. However, retrieval models often fail to find the most relevant context, and generation models, with limited context capacity, can hallucinate when given irrelevant data. We present a novel framework that leverages a Programming Knowledge Graph (PKG) to semantically represent and retrieve code. This approach enables fine-grained code retrieval by focusing on the most relevant segments while reducing irrelevant context through a tree-pruning technique. PKG is coupled with a re-ranking mechanism to reduce hallucinations further by selectively integrating non-RAG solutions. We propose two retrieval approaches, block-wise and function-wise, based on the PKG, optimizing context granularity. Evaluations on the HumanEval and MBPP benchmarks show our method improves pass@1 accuracy by up to 20%, and outperforms state-of-the-art models by up to 34% on MBPP. Our contributions include PKG-based retrieval, tree pruning to enhance retrieval precision, a re-ranking method for robust solution selection and a Fill-in-the-Middle (FIM) enhancer module for automatic code augmentation with relevant comments and docstrings. |
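The tree-pruning idea in this abstract can be sketched generically: recursively drop subtrees whose relevance to the query falls below a threshold, so only the most relevant code segments reach the generator. Everything below is an illustrative assumption, not the paper's implementation: the tree shape, the node labels, and the keyword-overlap scorer are all stand-ins for the PKG and its learned relevance model.

```python
def relevance(query, label):
    """Toy relevance: fraction of a node's name tokens found in the query.
    A stand-in for the semantic scoring a PKG-based retriever would use."""
    q = set(query.lower().split())
    l = set(label.lower().split("_"))
    return len(q & l) / max(len(l), 1)

def prune(tree, query, threshold=0.5):
    """tree is (label, [children]); return a pruned copy, or None if the
    whole subtree is irrelevant to the query."""
    label, children = tree
    kept = [c for c in (prune(c, query, threshold) for c in children) if c]
    # Keep a node if any descendant survived or the node itself is relevant.
    if kept or relevance(query, label) >= threshold:
        return (label, kept)
    return None

# Hypothetical function-call tree for one code module.
module = ("parse_json", [("read_file", []), ("json_decode", []), ("log_metrics", [])])
pruned = prune(module, "decode json payload")
```

After pruning, only the `parse_json`/`json_decode` branch remains as retrieval context; unrelated helpers like `read_file` and `log_metrics` never reach the generation model's limited context window.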
Voted: Davis, Mike
#711
-
Safavi 2021
Relational World Knowledge Representation in Contextual Language Models: A Review
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021;():1053-1067 Punta Cana, DOMINICAN REP Assoc Computational Linguistics-Acl 2021 Ref ID: 3589 Relational knowledge bases (KBs) are commonly used to represent world knowledge in machines. However, while advantageous for their high degree of precision and interpretability, KBs are usually organized according to manually-defined schemas, which limit their expressiveness and require significant human efforts to engineer and maintain. In this review, we take a natural language processing perspective to these limitations, examining how they may be addressed in part by training deep contextual language models (LMs) to internalize and express relational knowledge in more flexible forms. We propose to organize knowledge representation strategies in LMs by the level of KB supervision provided, from no KB supervision at all to entity- and relation-level supervision. Our contributions are threefold: (1) We provide a high-level, extensible taxonomy for knowledge representation in LMs; (2) Within our taxonomy, we highlight notable models, evaluation tasks, and findings, in order to provide an up-to-date review of current knowledge representation capabilities in LMs; and (3) We suggest future research directions that build upon the complementary aspects of LMs and KBs as knowledge representations. |
Voted: Mike, Srividya
#3307
-
Saha 2023
A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction
arXiv 2023;(): 2023 Ref ID: 7795 Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare it with a span based approach to causality extraction. Our experiments show that embeddings from pre-trained language models (e.g. BERT) provide a significant performance boost on this task compared to previous state-of-the-art models with complex architectures. We observe that span based models perform better than simple sequence tagging models based on BERT across all 4 data sets from diverse domains with different types of cause-effect phrases. |
Voted: Davis, Mike
#2169
-
Saini 2021
Automated Traceability for Domain Modelling Decisions Empowered by Artificial Intelligence
2021 IEEE 29th International Requirements Engineering Conference (RE) 2021;():173-184 2021 DOI: 10.1109/RE51729.2021.00023 · Ref ID: 6300 Domain modelling abstracts real-world entities and their relationships in the form of class diagrams for a given domain problem space. Modellers often perform domain modelling to reduce the gap between understanding the problem description which expresses requirements in natural language and the concise interpretation of these requirements. However, the manual practice of domain modelling is both time-consuming and error-prone. These issues are further aggravated when problem descriptions are long, which makes it hard to trace modelling decisions from domain models to problem descriptions or vice-versa leading to completeness and conciseness issues. Automated support for tracing domain modelling decisions in both directions is thus advantageous. In this paper, we propose an automated approach that uses artificial intelligence techniques to extract domain models along with their trace links. We present a traceability information model to enable traceability of modelling decisions in both directions and provide its proof-of-concept in the form of a tool. The evaluation on a set of unseen problem descriptions shows that our approach is promising with an overall median F2 score of 82.04%. We conduct an exploratory user study to assess the benefits and limitations of our approach and present the lessons learned from this study. |
Voted: Mike, mohammed afaan
#1208
-
Sakai 2024
Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():8084-8099 Association for Computational Linguistics (ACL) 2024 Ref ID: 4443 Knowledge graphs (KGs) consist of links that describe relationships between entities. Due to the difficulty of manually enumerating all relationships between entities, automatically completing them is essential for KGs. Knowledge Graph Completion (KGC) is a task that infers unseen relationships between entities in a KG. Traditional embedding-based KGC methods (e.g. RESCAL, TransE, DistMult, ComplEx, RotatE, HAKE, HousE, etc.) infer missing links using only the knowledge from training data. In contrast, the recent Pre-trained Language Model (PLM)-based KGC utilizes knowledge obtained during pre-training, which means it can estimate missing links between entities by reusing memorized knowledge from pre-training without inference. This part is problematic because building KGC models aims to infer unseen links between entities. However, conventional evaluations in KGC do not consider inference and memorization abilities separately. Thus, a PLM-based KGC method, which achieves high performance in current KGC evaluations, may be ineffective in practical applications. To address this issue, we analyze whether PLM-based KGC methods make inferences or merely access memorized knowledge. For this purpose, we propose a method for constructing synthetic datasets specified in this analysis and conclude that PLMs acquire the inference abilities required for KGC through pre-training, even though the performance improvements mostly come from textual information of entities and relations. © 2024 Association for Computational Linguistics. |
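Among the embedding-based KGC baselines this abstract lists, TransE has the simplest scoring rule: a triple (h, r, t) is plausible when the tail embedding lies near head + relation. The sketch below illustrates that rule only; the two-dimensional vectors are toy values, not trained embeddings.

```python
def transe_score(h, r, t):
    """TransE plausibility: negative L1 distance between h + r and t.
    Higher (closer to zero) means more plausible."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

h = [0.1, 0.3]        # head entity embedding (toy values)
r = [0.2, -0.1]       # relation embedding
t_true = [0.3, 0.2]   # tail satisfying h + r ≈ t
t_false = [0.9, 0.9]  # unrelated tail

# The translation-consistent tail scores higher than the unrelated one.
assert transe_score(h, r, t_true) > transe_score(h, r, t_false)
```

Training fits these vectors so observed triples score higher than corrupted ones; the paper's point is that a PLM-based scorer can instead answer from memorized pre-training text, which this purely geometric baseline cannot.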
Voted: Srividya, Xinchen
#1948
-
Sakhovskiy 2024
TextGraphs 2024 Shared Task on Text-Graph Representations for Knowledge Graph Question Answering
TextGraphs at ACL 2024 - Proceedings of TextGraphs-17: Graph-Based Methods for Natural Language Processing, 62nd Annual Meeting of the Association of Computational Linguistics 2024;():116-125 Association for Computational Linguistics (ACL) 2024 Ref ID: 4264 This paper describes the results of the Knowledge Graph Question Answering (KGQA) shared task that was co-located with the TextGraphs 2024 workshop. In this task, given a textual question and a list of entities with the corresponding KG subgraphs, the participating system should choose the entity that correctly answers the question. Our competition attracted thirty teams, four of which outperformed our strong ChatGPT-based zero-shot baseline. In this paper, we overview the participating systems and analyze their performance according to a large-scale automatic evaluation. To the best of our knowledge, this is the first competition aimed at the KGQA problem using the interaction between large language models (LLMs) and knowledge graphs. © 2024 Association for Computational Linguistics. |
Voted: Mike, Srividya
#3514
-
Salinas 2023
"Im not Racist but...": Discovering Bias in the Internal Knowledge of Large Language Models
arXiv 2023;(): 2023 Ref ID: 7892 Large language models (LLMs) have garnered significant attention for their remarkable performance in a continuously expanding set of natural language processing tasks. However, these models have been shown to harbor inherent societal biases, or stereotypes, which can adversely affect their performance in their many downstream applications. In this paper, we introduce a novel, purely prompt-based approach to uncover hidden stereotypes within any arbitrary LLM. Our approach dynamically generates a knowledge representation of internal stereotypes, enabling the identification of biases encoded within the LLM's internal knowledge. By illuminating the biases present in LLMs and offering a systematic methodology for their analysis, our work contributes to advancing transparency and promoting fairness in natural language processing systems. |
Voted: yuexi, Srividya
#2866
-
Sang 2022
A Scalable Embedding Based Neural Network Method for Discovering Knowledge From Biomedical Literature
IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19(3):1294-1301 2022 DOI: 10.1109/TCBB.2020.3003947 · Ref ID: 6033 Nowadays, the amount of biomedical literatures is growing at an explosive speed, and much useful knowledge is yet undiscovered in the literature. Classical information retrieval techniques allow to access explicit information from a given collection of information, but are not able to recognize implicit connections. Literature-based discovery (LBD) is characterized by uncovering hidden associations in non-interacting literature. It could significantly support scientific research by identifying new connections between biomedical entities. However, most of the existing approaches to LBD are not scalable and may not be sufficient to detect complex associations in non-directly-connected literature. In this article, we present a model which incorporates biomedical knowledge graph, graph embedding, and deep learning methods for literature-based discovery. First, the relations between biomedical entities are extracted from biomedical abstracts and then a knowledge graph is constructed by using these obtained relations. Second, the graph embedding technologies are applied to convert the entities and relations in the knowledge graph into a low-dimensional vector space. Third, a bidirectional Long Short-Term Memory (BLSTM) network is trained based on the entity associations represented by the pre-trained graph embeddings. Finally, the learned model is used for open and closed literature-based discovery tasks. The experimental results show that our method could not only effectively discover hidden associations between entities, but also reveal the corresponding mechanism of interactions. It suggests that incorporating knowledge graph and deep learning methods is an effective way for capturing the underlying complex associations between entities hidden in the literature. |
Voted: Mike, Srividya
#3571
-
Sanmartin 2024
KG-RAG: Bridging the Gap Between Knowledge and Creativity
arXiv 2024;(): 2024 Ref ID: 8298 Ensuring factual accuracy while maintaining the creative capabilities of Large Language Model Agents (LMAs) poses significant challenges in the development of intelligent agent systems. LMAs face prevalent issues such as information hallucinations, catastrophic forgetting, and limitations in processing long contexts when dealing with knowledge-intensive tasks. This paper introduces a KG-RAG (Knowledge Graph-Retrieval Augmented Generation) pipeline, a novel framework designed to enhance the knowledge capabilities of LMAs by integrating structured Knowledge Graphs (KGs) with the functionalities of LLMs, thereby significantly reducing the reliance on the latent knowledge of LLMs. The KG-RAG pipeline constructs a KG from unstructured text and then performs information retrieval over the newly created graph to perform KGQA (Knowledge Graph Question Answering). The retrieval methodology leverages a novel algorithm called Chain of Explorations (CoE), which benefits from LLM reasoning to explore nodes and relationships within the KG sequentially. Preliminary experiments on the ComplexWebQuestions dataset demonstrate notable improvements in the reduction of hallucinated content and suggest a promising path toward developing intelligent systems adept at handling knowledge-intensive tasks. |
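The sequential traversal that a Chain-of-Explorations-style retriever performs can be sketched as a scored graph expansion: starting from seed entities, repeatedly expand neighbors and keep only those a relevance function deems useful. The graph contents, the seed, and the keyword-overlap scorer below are illustrative assumptions; in KG-RAG the scoring and path selection are delegated to LLM reasoning, not a keyword match.

```python
# Toy knowledge graph: entity -> list of (relation, object) edges.
kg = {
    "Marie Curie": [("award", "Nobel Prize in Physics"), ("spouse", "Pierre Curie")],
    "Pierre Curie": [("award", "Nobel Prize in Physics")],
    "Nobel Prize in Physics": [],
}

def score(question, relation, entity):
    # Stand-in relevance score; the paper would ask an LLM to rank candidates.
    words = question.lower().split()
    return 1.0 if any(w in relation or w in entity.lower() for w in words) else 0.0

def chain_of_explorations(kg, question, seeds, hops=2, threshold=0.5):
    """Expand the frontier hop by hop, keeping only relevant neighbors."""
    frontier, visited = list(seeds), set(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for rel, obj in kg.get(node, []):
                if obj not in visited and score(question, rel, obj) >= threshold:
                    visited.add(obj)
                    next_frontier.append(obj)
        frontier = next_frontier
    return visited

answer_set = chain_of_explorations(kg, "Which award did Marie Curie receive?",
                                   ["Marie Curie"])
```

The entities gathered this way become the grounded context handed to the generator, which is how the pipeline reduces reliance on the model's latent (and possibly hallucinated) knowledge.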
mohammed afaan voted · Ishan voted
#3835
-
Sannidhi 2024
Retrieval-Augmented Generation Meets Data-Driven Tabula Rasa Approach for Temporal Knowledge Graph Forecasting
arXiv 2024;(): 2024 Ref ID: 8556 Pre-trained large language models (PLLMs) like OpenAI ChatGPT and Google Gemini face challenges such as inaccurate factual recall, hallucinations, biases, and future data leakage for temporal Knowledge Graph (tKG) forecasting. To address these issues, we introduce sLA-tKGF (small-scale language assistant for tKG forecasting), which utilizes Retrieval-Augmented Generation (RAG) aided, custom-trained small-scale language models through a tabula rasa approach from scratch for effective tKG forecasting. Our framework constructs knowledge-infused prompts with relevant historical data from tKGs, web search results, and PLLMs-generated textual descriptions to understand historical entity relationships prior to the target time. It leverages these external knowledge-infused prompts for deeper understanding and reasoning of context-specific semantic and temporal information to zero-shot prompt small-scale language models for more accurate predictions of future events within tKGs. It reduces hallucinations and mitigates distributional shift challenges through comprehending changing trends over time. As a result, it enables more accurate and contextually grounded forecasts of future events while minimizing computational demands. Rigorous empirical studies demonstrate our framework robustness, scalability, and state-of-the-art (SOTA) performance on benchmark datasets with interpretable and trustworthy tKG forecasting. |
mohammed afaan voted · Ishan voted
#3484
-
Sansford 2024
GraphEval: A Knowledge-Graph Based LLM Hallucination Evaluation Framework
arXiv 2024;(): 2024 Ref ID: 8460 Methods to evaluate Large Language Model (LLM) responses and detect inconsistencies, also known as hallucinations, with respect to the provided knowledge, are becoming increasingly important for LLM applications. Current metrics fall short in their ability to provide explainable decisions, systematically check all pieces of information in the response, and are often too computationally expensive to be used in practice. We present GraphEval: a hallucination evaluation framework based on representing information in Knowledge Graph (KG) structures. Our method identifies the specific triples in the KG that are prone to hallucinations and hence provides more insight into where in the response a hallucination has occurred, if at all, than previous methods. Furthermore, using our approach in conjunction with state-of-the-art natural language inference (NLI) models leads to an improvement in balanced accuracy on various hallucination benchmarks, compared to using the raw NLI models. Lastly, we explore the use of GraphEval for hallucination correction by leveraging the structure of the KG, a method we name GraphCorrect, and demonstrate that the majority of hallucinations can indeed be rectified. |
yuexi voted · Srividya voted
#3506
-
Sarmah 2024
HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction
arXiv 2024;(): 2024 Ref ID: 8522 Extraction and interpretation of intricate information from unstructured text data arising in financial applications, such as earnings call transcripts, present substantial challenges to large language models (LLMs) even using the current best practices to use Retrieval Augmented Generation (RAG) (referred to as VectorRAG techniques which utilize vector databases for information retrieval) due to challenges such as domain specific terminology and complex formats of the documents. We introduce a novel approach based on a combination, called HybridRAG, of the Knowledge Graphs (KGs) based RAG techniques (called GraphRAG) and VectorRAG techniques to enhance question-answer (Q&A) systems for information extraction from financial documents that is shown to be capable of generating accurate and contextually relevant answers. Using experiments on a set of financial earning call transcripts documents which come in the form of Q&A format, and hence provide a natural set of pairs of ground-truth Q&As, we show that HybridRAG which retrieves context from both vector database and KG outperforms both traditional VectorRAG and GraphRAG individually when evaluated at both the retrieval and generation stages in terms of retrieval accuracy and answer generation. The proposed technique has applications beyond the financial domain |
brandon voted · Kwesi voted
#3933
-
Sarto 2024
Towards Retrieval-Augmented Architectures for Image Captioning
arXiv 2024;(): 2024 Ref ID: 8304 The objective of image captioning models is to bridge the gap between the visual and linguistic modalities by generating natural language descriptions that accurately reflect the content of input images. In recent years, researchers have leveraged deep learning-based models and made advances in the extraction of visual features and the design of multimodal connections to tackle this task. This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process. Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities, a differentiable encoder to represent input images, and a kNN-augmented language model to predict tokens based on contextual cues and text retrieved from the external memory. We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions, especially with a larger retrieval corpus. This work provides valuable insights into retrieval-augmented captioning models and opens up new avenues for improving image captioning at a larger scale. |
mohammed afaan voted · yuexi voted
#1575
-
Sawant 2013
Learning joint query interpretation and response ranking
WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web 2013;():1099-1109 Association for Computing Machinery 2013 DOI: 10.1145/2488388.2488484 · Ref ID: 5839 Thanks to information extraction and semantic Web efforts, search on unstructured text is increasingly refined using semantic annotations and structured knowledge bases. However, most users cannot become familiar with the schema of knowledge bases and ask structured queries. Interpreting free-format queries into a more structured representation is of much current interest. The dominant paradigm is to segment or partition query tokens by purpose (references to types, entities, attribute names, attribute values, relations) and then launch the interpreted query on structured knowledge bases. Given that structured knowledge extraction is never complete, here we choose a less trodden path: a data representation that retains the unstructured text corpus, along with structured annotations (mentions of entities and relationships) on it. We propose two new, natural formulations for joint query interpretation and response ranking that exploit bidirectional flow of information between the knowledge base and the corpus. One, inspired by probabilistic language models, computes expected response scores over the uncertainties of query interpretation. The other is based on max-margin discriminative learning, with latent variables representing those uncertainties. In the context of typed entity search, both formulations bridge a considerable part of the accuracy gap between a generic query that does not constrain the type at all, and the upper bound where the "perfect" target entity type of each query is provided by humans. Our formulations are also superior to a two-stage approach of first choosing a target type using recent query type prediction techniques, and then launching a type-restricted entity search query. Copyright is held by the International World Wide Web Conference Committee (IW3C2). |
yuexi voted · Mike voted
#1182
-
Sawczyn 2024
Developing PUGG for Polish: A Modern Approach to KBQA, MRC, and IR Dataset Construction
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():10978-10996 Association for Computational Linguistics (ACL) 2024 Ref ID: 4280 Advancements in AI and natural language processing have revolutionized machine-human language interactions, with question answering (QA) systems playing a pivotal role. The knowledge base question answering (KBQA) task, utilizing structured knowledge graphs (KG), allows for handling extensive knowledge-intensive questions. However, a significant gap exists in KBQA datasets, especially for low-resource languages. Many existing construction pipelines for these datasets are outdated and inefficient in human labor, and modern assisting tools like Large Language Models (LLM) are not utilized to reduce the workload. To address this, we have designed and implemented a modern, semi-automated approach for creating datasets, encompassing tasks such as KBQA, Machine Reading Comprehension (MRC), and Information Retrieval (IR), tailored explicitly for low-resource environments. We executed this pipeline and introduced the PUGG dataset, the first Polish KBQA dataset, and novel datasets for MRC and IR. Additionally, we provide a comprehensive implementation, insightful findings, detailed statistics, and evaluation of baseline models. © 2024 Association for Computational Linguistics. |
Srividya voted · Ishan voted
#3795
-
Scheerer 2024
QirK: Question Answering via Intermediate Representation on Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8531 We demonstrate QirK, a system for answering natural language questions on Knowledge Graphs (KG). QirK can answer structurally complex questions that are still beyond the reach of emerging Large Language Models (LLMs). It does so using a unique combination of database technology, LLMs, and semantic search over vector embeddings. The glue for these components is an intermediate representation (IR). The input question is mapped to IR using LLMs, which is then repaired into a valid relational database query with the aid of a semantic search on vector embeddings. This allows a practical synthesis of LLM capabilities and KG reliability. A short video demonstrating QirK is available at https://youtu.be/6c81BLmOZ0U. |
yuexi voted · Mike voted
#3235
-
Schneider 2024
Bridging Information Gaps in Dialogues With Grounded Exchanges Using Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8501 Knowledge models are fundamental to dialogue systems for enabling conversational interactions, which require handling domain-specific knowledge. Ensuring effective communication in information-providing conversations entails aligning user understanding with the knowledge available to the system. However, dialogue systems often face challenges arising from semantic inconsistencies in how information is expressed in natural language compared to how it is represented within the system's internal knowledge. To address this problem, we study the potential of large language models for conversational grounding, a mechanism to bridge information gaps by establishing shared knowledge between dialogue participants. Our approach involves annotating human conversations across five knowledge domains to create a new dialogue corpus called BridgeKG. Through a series of experiments on this dataset, we empirically evaluate the capabilities of large language models in classifying grounding acts and identifying grounded information items within a knowledge graph structure. Our findings offer insights into how these models use in-context learning for conversational grounding tasks and common prediction errors, which we illustrate with examples from challenging dialogues. We discuss how the models handle knowledge graphs as a semantic layer between unstructured dialogue utterances and structured information items. |
brandon voted · Kwesi voted
#2733
-
Schoch 2024
NL2IBE – Ontology-controlled Transformation of Natural Language into Formalized Engineering Artefacts
2024 IEEE Conference on Artificial Intelligence (CAI) 2024;():997-1004 2024 DOI: 10.1109/CAI59869.2024.00182 · Ref ID: 6542 Looking at Process and Automation Engineering (P&AE) today, for the technically adept engineer, there are many different tools available to support the engineering work from translation of engineering intentions into module and plant descriptions, to definition and parametrization of entire process plant setups, for export to a control system. However, still today, in the very early engineering phases, engineering intentions either need to be entered already in a structured and controlled expert language or require a human expert’s manual efforts for translation from unstructured language into formalized representations, in order for thereon-based consistent further processing in the existing tools. This process is time-consuming, fuzzy, and error-prone due to potential misconceptions and ambiguities, even for domain experts. In this work, we therefore present our NL2IBE Tool, which makes use of modern Natural Language Processing in combination with Ontology Mining, and which, based on and controlled by an underlying ontology, allows for the deterministic transformation of natural language intentions into structured and consistent engineering artefacts. We describe the overall tool architecture as well as crucial functionalities and implementation features, followed by an evaluation by the example of a hydrogen generation and CCSU use case. We conclude with a discussion of the proposed tool and give an outlook on future research. |
mohammed afaan voted · Ishan voted
#2824
-
Schorlemmer 2011
Reasoning about Distributed Knowledge-Transforming Peer Interactions
IEEE Transactions on Knowledge and Data Engineering 2011;23(9):1419-1431 2011 DOI: 10.1109/TKDE.2010.265 · Ref ID: 6028 We address the problem of how to reason about properties of knowledge transformations as they occur in distributed and decentralized interactions between large and complex artifacts, such as databases, web services, and ontologies. Based on the conceptual distinction between specifications of interactions and properties of knowledge transformations that follow from these interactions, we explore a novel mixture of process calculus and property inference by connecting interaction models with knowledge transformation rules. We aim at being generic in our exploration, hence our emphasis on abstract knowledge transformations, although we exemplify it using a lightweight specification language for interaction modeling (for which an executable peer-to-peer environment already exists) and provide a formal semantics for knowledge transformation rules using the theory of institutions. Consequently, our exploration is also an example of the gain obtained by linking current state-of-the-art distributed knowledge engineering based on web services and peer-based architectures with formal methods drawn from a long tradition in algebraic specification. |
mohammed afaan voted · yuexi voted
#3924
-
Sengupta 2024
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings
arXiv 2024;(): 2024 Ref ID: 8035 In Natural Language Processing (NLP), Machine Reading Comprehension (MRC) is the task of answering a question based on a given context. To handle questions in the medical domain, modern language models such as BioBERT, SciBERT and even ChatGPT are trained on vast amounts of in-domain medical corpora. However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. Knowledge graphs are powerful resources for accessing medical information. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from medical knowledge graphs with the embedding spaces of pre-trained language models (LMs). The aligned embeddings are fused with open-domain LMs BERT and RoBERTa that are fine-tuned for two MRC tasks, span detection (COVID-QA) and multiple-choice questions (PubMedQA). We compare our method to prior techniques that rely on a vocabulary overlap for embedding alignment and show how our method circumvents this requirement to deliver better performance. On both datasets, our method allows BERT/RoBERTa to either perform on par (occasionally exceeding) with stronger domain-specific models or show improvements in general over prior techniques. With the proposed approach, we signal an alternative method to in-domain pre-training to achieve domain proficiency. |
Mike voted · Srividya voted
#661
-
Shang 2019
Pre-training of Graph Augmented Transformers for Medication Recommendation
28th International Joint Conference on Artificial Intelligence 2019;():5953-5959 Macao, PEOPLES R CHINA Ijcai-Int Joint Conf Artif Intell 2019 Ref ID: 3686 Medication recommendation is an important healthcare application. It is commonly formulated as a temporal prediction task. Hence, most existing works only utilize longitudinal electronic health records (EHRs) from a small number of patients with multiple visits ignoring a large number of patients with a single visit (selection bias). Moreover, important hierarchical knowledge such as diagnosis hierarchy is not leveraged in the representation learning process. To address these challenges, we propose G-BERT, a new model to combine the power of Graph Neural Networks (GNNs) and BERT (Bidirectional Encoder Representations from Transformers) for medical code representation and medication recommendation. We use GNNs to represent the internal hierarchical structures of medical codes. Then we integrate the GNN representation into a transformer-based visit encoder and pre-train it on EHR data from patients only with a single visit. The pre-trained visit encoder and representation are then fine-tuned for downstream predictive tasks on longitudinal EHRs from patients with multiple visits. G-BERT is the first to bring the language model pre-training schema into the healthcare domain and it achieved state-of-the-art performance on the medication recommendation task. |
Kwesi voted · Xinchen voted
#1880
-
Shang 2022
Sequential Semantic Knowledge Graph Embedding
Lecture Notes in Electrical Engineering 2022;861 LNEE():1547-1557 Springer Science and Business Media Deutschland GmbH 2022 DOI: 10.1007/978-981-16-9492-9_153 · Ref ID: 5532 Knowledge graph embedding is aimed at representing entities and relations of knowledge graph in a low-dimensional continuous vector space. Previous embedding models pay little attention to the sequential semantic information in triples and as a result, may lead to the semantic drift problem. Towards this end, we propose a novel sequential semantic embedding (SeqSemE) model to address this problem in this paper. Firstly, we utilize a sequential language model to capture sequential information of triples and interactions between entities and relations. Secondly, we propose a method of learning two embeddings for each relation to avoid semantic drift. Extensive experiments on link prediction show that our SeqSemE is efficient and effective. It can obtain better performance than previous state-of-the-art embedding models. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. |
Mike voted · mohammed afaan voted
#214
-
Shao 2024
Enhancing Bug Report Summaries Through Knowledge-Specific and Contrastive Learning Pre-Training
Bug reports are crucial in software maintenance, with concise summaries significantly enhancing the efficiency of bug triagers and ultimately contributing to the development of high-quality software products. Contemporary methods for automatic bug report summarization primarily utilize neural networks' robust learning capabilities. However, these approaches often produce suboptimal summaries due to two primary limitations: 1) the difficulty in assimilating the domain-specific knowledge inherent in bug reports, and 2) the limitations of purely supervised learning in comprehending the comprehensive context of bug reports. To address the above two problems, in this paper, we propose a new approach for bug report summarization, namely KSCLP, which leverages large language models and domain-specific pre-training strategies, i.e., Knowledge-Specific and Contrastive Learning Pre-training. Specifically, the Knowledge-Specific strategy allows to pre-train KSCLP on project-specific bug reports corpus, by which the model can fully learn internal knowledge of bug reports, learning bug report-aware representation. As for the Contrastive Learning strategy, it performs a sequence-level pre-training for KSCLP, helping it capture the semantic information of bug reports on a global level. Upon completion of the pre-training phase, KSCLP undergoes further refinement through a Sequence-to-Sequence framework specifically tailored for bug report summarization. The efficacy of KSCLP is rigorously evaluated against five baseline models using a publicly available dataset. The empirical results demonstrate that KSCLP outperforms all baselines, achieving remarkable improvements by up to 23.73, 13.97, and 20.89 points in ROUGE-1, ROUGE-2, and ROUGE-L metrics, thereby setting new benchmarks in the field of bug report summarization. |
brandon voted · Kwesi voted
#1700
-
Shao 2024
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():131-156 Association for Computational Linguistics (ACL) 2024 Ref ID: 4440 Structured data, prevalent in tables, databases, and knowledge graphs, poses a significant challenge in its representation. With the advent of large language models (LLMs), there has been a shift towards linearization-based methods, which process structured data as sequential token streams, diverging from approaches that explicitly model structure, often as a graph. Crucially, there remains a gap in our understanding of how these linearization-based methods handle structured data, which is inherently non-linear. This work investigates the linear handling of structured data in encoder-decoder language models, specifically T5. Our findings reveal the model’s ability to mimic human-designed processes such as schema linking and syntax prediction, indicating a deep, meaningful learning of structure beyond simple token sequencing. We also uncover insights into the model’s internal mechanisms, including the ego-centric nature of structure node encodings and the potential for model compression due to modality fusion redundancy. Overall, this work sheds light on the inner workings of linearization-based methods and could potentially provide guidance for future research. © 2024 Association for Computational Linguistics. |
Srividya voted · Mike voted
#3512
-
Shapurian 2023
Identifying Planetary Names in Astronomy Papers: A Multi-Step Approach
arXiv 2023;(): 2023 Ref ID: 7985 The automatic identification of planetary feature names in astronomy publications presents numerous challenges. These features include craters, defined as roughly circular depressions resulting from impact or volcanic activity; dorsas, which are elongate raised structures or wrinkle ridges; and lacus, small irregular patches of dark, smooth material on the Moon, referred to as "lake" (Planetary Names Working Group, n.d.). Many feature names overlap with places or people's names that they are named after, for example, Syria, Tempe, Einstein, and Sagan, to name a few (U.S. Geological Survey, n.d.). Some feature names have been used in many contexts, for instance, Apollo, which can refer to mission, program, sample, astronaut, seismic, seismometers, core, era, data, collection, instrument, and station, in addition to the crater on the Moon. Some feature names can appear in the text as adjectives, like the lunar craters Black, Green, and White. Some feature names in other contexts serve as directions, like craters West and South on the Moon. Additionally, some features share identical names across different celestial bodies, requiring disambiguation, such as the Adams crater, which exists on both the Moon and Mars. We present a multi-step pipeline combining rule-based filtering, statistical relevance analysis, part-of-speech (POS) tagging, named entity recognition (NER) model, hybrid keyword harvesting, knowledge graph (KG) matching, and inference with a locally installed large language model (LLM) to reliably identify planetary names despite these challenges. When evaluated on a dataset of astronomy papers from the Astrophysics Data System (ADS), this methodology achieves an F1-score over 0.97 in disambiguating planetary feature names. |
mohammed afaan voted · yuexi voted
#359
-
Sharifzadeh 2022
Improving Scene Graph Classification by Exploiting Knowledge from Texts
36th AAAI Conference on Artificial Intelligence / 34th Conference on Innovative Applications of Artificial Intelligence / 12th Symposium on Educational Advances in Artificial Intelligence 2022;():2189-2197 Electr Network Assoc Advancement Artificial Intelligence 2022 Ref ID: 3339 Training scene graph classification models requires a large amount of annotated image data. Meanwhile, scene graphs represent relational knowledge that can be modeled with symbolic data from texts or knowledge graphs. While image annotation demands extensive labor, collecting textual descriptions of natural scenes requires less effort. In this work, we investigate whether textual scene descriptions can substitute for annotated image data. To this end, we employ a scene graph classification framework that is trained not only from annotated images but also from symbolic data. In our architecture, the symbolic entities are first mapped to their correspondent image-grounded representations and then fed into the relational reasoning pipeline. Even though a structured form of knowledge, such as the form in knowledge graphs, is not always available, we can generate it from unstructured texts using a transformer-based language model. We show that by fine-tuning the classification pipeline with the extracted knowledge from texts, we can achieve similar to 8x more accurate results in scene graph classification, similar to 3x in object classification, and similar to 1.5x in predicate classification, compared to the supervised baselines with only 1% of the annotated images. |
Srividya voted · Xinchen voted
#806
-
Sharma 2021
T³: Domain-Agnostic Neural Time-series Narration
21st IEEE International Conference on Data Mining (IEEE ICDM) 2021;():1324-1329 Electr Network Ieee Computer Soc 2021 DOI: 10.1109/icdm51629.2021.00165 · Ref ID: 3399 The task of generating rich and fluent narratives that aptly describe the characteristics, trends, and anomalies of time-series data is invaluable to the sciences (geology, meteorology, epidemiology) or finance (trades, stocks). The efforts for time-series narration hitherto are domain-specific and use predefined templates that offer consistency but lead to mechanical narratives. We present T-3 (Time-series-To-Text), a domain-agnostic neural framework for time-series narration, that couples the representation of essential time-series elements in the form of a dense knowledge graph and the translation of said knowledge graph into rich and fluent narratives through the transfer-learning capabilities of PLMs (Pre-trained Language Models). To the best of our knowledge, T-3 is the first investigation of the use of neural strategies for time-series narration. We showcase that T-3 can improve the lexical diversity of the generated narratives by up to 65.38% while still maintaining grammatical integrity. The performance and practicality of T-3 is further validated through an expert review (n = 21) where 76.2% of participating experts wary of auto-generated narratives favored T-3 as a deployable system for time-series narration due to its rich and diverse narratives. Our code-base and the datasets used with detailed instructions for reproducibility is publicly hosted(1). |
brandon voted · Kwesi voted
#3905
-
Sharma 2021
TCube: Domain-Agnostic Neural Time-series Narration
arXiv 2021;(): 2021 Ref ID: 7488 The task of generating rich and fluent narratives that aptly describe the characteristics, trends, and anomalies of time-series data is invaluable to the sciences (geology, meteorology, epidemiology) or finance (trades, stocks, or sales and inventory). The efforts for time-series narration hitherto are domain-specific and use predefined templates that offer consistency but lead to mechanical narratives. We present TCube (Time-series-to-text), a domain-agnostic neural framework for time-series narration, that couples the representation of essential time-series elements in the form of a dense knowledge graph and the translation of said knowledge graph into rich and fluent narratives through the transfer-learning capabilities of PLMs (Pre-trained Language Models). TCube's design primarily addresses the challenge that lies in building a neural framework in the complete paucity of annotated training data for time-series. The design incorporates knowledge graphs as an intermediary for the representation of essential time-series elements which can be linearized for textual translation. To the best of our knowledge, TCube is the first investigation of the use of neural strategies for time-series narration. Through extensive evaluations, we show that TCube can improve the lexical diversity of the generated narratives by up to 65.38% while still maintaining grammatical integrity. The practicality and deployability of TCube is further validated through an expert review (n=21) where 76.2% of participating experts wary of auto-generated narratives favored TCube as a deployable system for time-series narration due to its richer narratives. Our code-base, models, and datasets, with detailed instructions for reproducibility is publicly hosted at https://github.com/Mandar-Sharma/TCube. |
Davis voted · mohammed afaan voted
#1303
-
Shcherbakov 2020
Exploring Looping Effects in RNN-based Architectures
Proceedings of the Australasian Language Technology Workshop 2020;18():6 Australasian Language Technology Association 2020 Ref ID: 5633 The paper investigates repetitive loops, a common problem in contemporary text generation (such as machine translation, language modelling, morphological inflection) systems. We hypothesized that a model’s failure to distinguish respective latent states for different positions in an output sequence may be the primary cause of the looping. Therefore, we propose adding a position-aware discriminating factor to the model in attempt to reduce that effect. We conduct a study on neural models with recurrent units by explicitly altering their decoder internal state. We use a task of morphological reinflection as a proxy to study the effects of the changes. Our results show that the probability of the occurrence of repetitive loops is significantly reduced by introduction of an extra neural decoder output. The output should be specifically trained to produce gradually increasing value upon generation of each character of a given sequence. We also explored variations of the technique and found that feeding the extra output back to the decoder amplifies the positive effects. © 2020, Australasian Language Technology Association. All rights reserved. |
Srividya voted · Mike voted
#3556
-
Shen 2024
Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models
arXiv 2024;(): 2024 Ref ID: 8652 As large language models (LLMs) become integral to various applications, ensuring both their safety and utility is paramount. Jailbreak attacks, which manipulate LLMs into generating harmful content, pose significant challenges to this balance. Existing defenses, such as prompt engineering and safety fine-tuning, often introduce computational overhead, increase inference latency, and lack runtime flexibility. Moreover, overly restrictive safety measures can degrade model utility by causing refusals of benign queries. In this paper, we introduce Jailbreak Antidote, a method that enables real-time adjustment of LLM safety preferences by manipulating a sparse subset of the model's internal states during inference. By shifting the model's hidden representations along a safety direction with varying strengths, we achieve flexible control over the safety-utility balance without additional token overhead or inference delays. Our analysis reveals that safety-related information in LLMs is sparsely distributed; adjusting approximately 5% of the internal state is as effective as modifying the entire state. Extensive experiments on nine LLMs (ranging from 2 billion to 72 billion parameters), evaluated against ten jailbreak attack methods and compared with six defense strategies, validate the effectiveness and efficiency of our approach. By directly manipulating internal states during reasoning, Jailbreak Antidote offers a lightweight, scalable solution that enhances LLM safety while preserving utility, opening new possibilities for real-time safety mechanisms in widely-deployed AI systems. |
yuexi voted · Srividya voted
#252
-
Shen 2020
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():8980-8994 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 2978 In this work, we aim at equipping pre-trained language models with structured knowledge. We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs. Building upon entity-level masked language models, our first contribution is an entity masking scheme that exploits relational knowledge underlying the text. This is fulfilled by using a linked knowledge graph to select informative entities and then masking their mentions. In addition, we use knowledge graphs to obtain distractors for the masked entities, and propose a novel distractor-suppressed ranking objective that is optimized jointly with the masked language model. In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training, to inject language models with structured knowledge via learning from raw text. It is more efficient than retrieval-based methods that perform entity linking and integration during finetuning and inference, and generalizes more effectively than methods that directly learn from concatenated graph triples. Experiments show that our proposed model achieves improved performance on five benchmarks, including question answering and knowledge base completion. |
Srividya voted · Davis voted
#2277
-
Shen 2024
Construction of Knowledge Graph of Judicial Case Based on LLMs and Embedding Models
2024 IEEE 2nd International Conference on Sensors, Electronics and Computer Engineering (ICSECE) 2024;():949-955 2024 DOI: 10.1109/ICSECE61636.2024.10729603 · Ref ID: 7075 This paper constructs the Judicial Case Knowledge Graph (JCKG) dataset based on China Judicial Judgements Online, filling a gap in knowledge graph research in the legal field. First, we systematically collate the relevant data from Judicial Judgements Online, and then use advanced Large Language Model (LLM) technology to extract the entities and relationships in the data. Then, through link prediction experiments, we compare JCKG horizontally against mainstream knowledge graph learning datasets and vertically across embedding models such as TransE, ConvE, and pRotatE to verify the validity of the JCKG dataset. Experiments show that the JCKG dataset can effectively support multi-hop inference. A legal knowledge graph plays an important role in an intelligent judicial system: it improves the efficiency of case processing and the accuracy of judicial decision-making and reduces the risk of errors. This research provides strong support for the field of legal artificial intelligence and promotes the development of the judicial system toward greater efficiency and intelligence. |
Ishan voted · brandon voted
#1003
-
Sheng 2023
An Augmentable Domain-specific Models for Financial Analysis
Proceedings - 2023 16th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2023 2023;(): Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/CISP-BMEI60920.2023.10373245 · Ref ID: 5012 Large-scale language models such as GPT-4 have revolutionized data analysis and interpretation by generating human-like text, automating insights, and detecting data errors. Large-scale language models have been applied in various fields and play an important role in many aspects. Large language models can also perform financial and technical analysis by cleaning data, generating synthetic data, handling bias, and supporting natural language queries. This paper proposes a language model that integrates multimodal data with external knowledge bases and domain-specific data, enhancing its reasoning ability by extending the domain-specific data. It reduces hallucinations and fine-tunes on domain-specific data by incorporating external knowledge bases to deepen the model's understanding of industry-specific language, concepts, and context. Technologies such as knowledge graphs, attention mechanisms, cross-modal embeddings, and federated collaborative training are used to address the challenges posed by the differing structures and semantics of multimodal data. The model also employs a feedback-loop mechanism that allows it to adapt to changing conditions, such as changing languages or new domain information. Experimental results show that the proposed model has a preliminary domain-specific ability to analyze and predict multimodal financial and technical data. © 2023 IEEE. |
Davis voted · Mike voted
#3669
-
Shi 2024
LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning
arXiv 2024;(): 2024 Ref ID: 8413 Recommender systems are pivotal in enhancing user experiences across various web applications by analyzing the complicated relationships between users and items. Knowledge graphs (KGs) have been widely used to enhance the performance of recommender systems. However, KGs are known to be noisy and incomplete, which makes it hard to provide reliable explanations for recommendation results. An explainable recommender system is crucial for product development and subsequent decision-making. To address these challenges, we introduce a novel recommender that synergizes Large Language Models (LLMs) and KGs to enhance recommendation and provide interpretable results. Specifically, we first harness the power of LLMs to augment KG reconstruction. LLMs comprehend and decompose user reviews into new triples that are added into the KG. In this way, we can enrich KGs with explainable paths that express user preferences. To enhance recommendation on the augmented KGs, we introduce a novel subgraph reasoning module that effectively measures the importance of nodes and discovers reasoning paths for recommendation. Finally, these reasoning paths are fed into the LLMs to generate interpretable explanations of the recommendation results. Our approach significantly enhances both the effectiveness and interpretability of recommender systems, especially in cross-selling scenarios where traditional methods falter. The effectiveness of our approach has been rigorously tested on four open real-world datasets, with our methods demonstrating superior performance over contemporary state-of-the-art techniques by an average improvement of 12%. The application of our model in the cross-selling recommendation system of a multinational engineering and technology company further underscores its practical utility and potential to redefine recommendation practices through improved accuracy and user trust. |
mohammed afaan voted · yuexi voted
#515
-
Shi 2024
Legal-LM: Knowledge Graph Enhanced Large Language Models for Law Consulting
20th International Conference on Intelligent Computing (ICIC) 2024;14878():175-186 Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024 DOI: 10.1007/978-981-97-5672-8_15 · Ref ID: 2956 This paper introduces Legal-LM, an advanced Large Language Model (LLM) enhanced with a Knowledge Graph, specifically designed for legal consulting in the Chinese legal domain. Addressing the challenges of domain-specific adaptation, data veracity, and consultations with non-professional users in legal AI, Legal-LM incorporates extensive legal corpora and a knowledge graph for effective legal knowledge acquisition. The model utilizes techniques such as external legal knowledge bases, soft prompts, and Direct Preference Optimization (DPO) to ensure accurate and diverse legal advice. Our experimental results demonstrate that Legal-LM exhibits superior performance over existing models in legal question answering, case analysis, and legal recommendations, showing its potential to facilitate legal consulting and education. |
mohammed afaan voted · yuexi voted
#3727
-
Shi 2024
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
arXiv 2024;(): 2024 Ref ID: 8451 Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content. Data owners may request the removal of their data from a trained model due to privacy or copyright concerns. However, exactly unlearning only these datapoints (i.e., retraining with the data removed) is intractable in modern-day models. This has led to the development of many approximate unlearning algorithms. The evaluation of the efficacy of these algorithms has traditionally been narrow in scope, failing to precisely quantify the success and practicality of the algorithm from the perspectives of both the model deployers and the data owners. We address this issue by proposing MUSE, a comprehensive machine unlearning evaluation benchmark that enumerates six diverse desirable properties for unlearned models: (1) no verbatim memorization, (2) no knowledge memorization, (3) no privacy leakage, (4) utility preservation on data not intended for removal, (5) scalability with respect to the size of removal requests, and (6) sustainability over sequential unlearning requests. Using these criteria, we benchmark how effectively eight popular unlearning algorithms can unlearn Harry Potter books and news articles from 7B-parameter LMs. Our results demonstrate that most algorithms can prevent verbatim memorization and knowledge memorization to varying degrees, but only one algorithm does not lead to severe privacy leakage. Furthermore, existing algorithms fail to meet deployers' expectations because they often degrade general model utility and also cannot sustainably accommodate successive unlearning requests or large-scale content removal. Our findings identify key issues with the practicality of existing unlearning algorithms on language models, and we release our benchmark to facilitate further evaluations: muse-bench.github.io |
Srividya voted · Ishan voted
#97
-
Shi 2023
ChatGraph: Interpretable Text Classification by Converting ChatGPT Knowledge to Graphs
23rd IEEE International Conference on Data Mining (IEEE ICDM) 2023;():515-520 Shanghai, PEOPLES R CHINA Ieee Computer Soc 2023 DOI: 10.1109/icdmw60847.2023.00073 · Ref ID: 3456 ChatGPT, as a recently launched large language model (LLM), has shown superior performance in various natural language processing (NLP) tasks. However, two major limitations hinder its potential applications: 1) the inflexibility of finetuning on downstream tasks, and 2) the lack of interpretability in the decision-making process. To tackle these limitations, we propose a novel framework that leverages the power of ChatGPT for specific tasks, such as text classification, while improving its interpretability. The proposed framework conducts a knowledge graph extraction task to extract refined and structural knowledge from the raw data using ChatGPT. The rich knowledge is then converted into a graph, which is further used to train an interpretable linear classifier to make predictions. To evaluate the effectiveness of our proposed method, we conduct experiments on four benchmark datasets. The results demonstrate that our method can significantly improve the prediction performance compared to directly utilizing ChatGPT for text classification tasks. Furthermore, our method provides a more transparent decision-making process compared with previous text classification methods. The code is available at https://github.com/sycny/ChatGraph. |
Kwesi voted · Xinchen voted
#1450
-
Shim 2021
A JOINT FRAMEWORK FOR DISTILLING THE EXPERTISE IN ELECTRIC POWER UTILITY DOMAIN WITH GENBERT
IET Conference Proceedings 2021;2021():1331-1335 Institution of Engineering and Technology 2021 DOI: 10.1049/icp.2021.2165 · Ref ID: 5516 Over the past decades, extracting crucial information such as domain expertise from unstructured data has been considered a great challenge. The domain of electric power utilities faces similar difficulties. Since the individual experiences and know-how of electric power utility technicians are not digitized into a database but are fragmented across many reports and documents, it is hard to find the right information, and the knowledge gap between workers has widened more and more. Natural language processing (NLP) based on deep learning technologies is emerging as one of the most efficient ways of searching textual information and extracting valuable context. Using these techniques, we can build a domain-specific language model that provides appropriate answers to users' questions. In this study, we propose a joint framework for distilling the expertise in the electric power utility domain with GenBERT. This framework consists of three sub-components: 'Pre-processing', 'Extract', and 'QA'. To evaluate the performance of our proposed framework, we conducted various comparison experiments on the 'Extract' and 'QA' components. As a result, our framework showed improved QA performance, answering electric-power-utility domain-specific questions with higher accuracy. © 2021 The Institution of Engineering and Technology. |
brandon voted · Kwesi voted
#2431
-
Sibunruang 2018
Finding Clinical Knowledge from MEDLINE Abstracts by Text Summarization Technique
2018 International Conference on Information Technology (InCIT) 2018;():1-6 2018 DOI: 10.23919/INCIT.2018.8584867 · Ref ID: 6208 Today, MEDLINE is an important repository containing more than 26 million citations and abstracts in the fields of medicine, while PubMed provides free access to MEDLINE and links to full-text articles. MEDLINE abstracts have become a potential source of new knowledge in the medical field. However, it is time-consuming and labour-intensive to find knowledge in MEDLINE abstracts when a search returns many abstracts, each of which may contain a large volume of information. Therefore, this work presents a method for summarizing clinical knowledge from a MEDLINE abstract. The main mechanisms of the proposed method are driven by natural language processing (NLP) and text-filtering techniques. The case study of this work is to summarize clinical knowledge from MEDLINE abstracts relating to cervical cancer in clinical trials. In the evaluation stage, the actual results obtained from a domain expert are compared against the predicted results. Evaluated by recall, precision, and F-score, the method returns satisfactory results: the averages of recall, precision, and F-measure are 0.84, 1.00, and 0.91, respectively. |
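The reported averages are internally consistent: the F-measure is the harmonic mean of precision and recall, and with precision 1.00 and recall 0.84 it rounds to the stated 0.91. A quick check:

```python
def f_measure(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# With the averages reported in the abstract (precision 1.00, recall 0.84),
# the harmonic mean is 2 * 0.84 / 1.84 ≈ 0.913, which rounds to the stated 0.91.
assert round(f_measure(1.00, 0.84), 2) == 0.91
```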
mohammed afaan voted · Ishan voted
#1293
-
Simon 2023
Experiments on GPT-3 Assisted Process Model Development
Proceedings - European Council for Modelling and Simulation, ECMS 2023;2023-June():270-276 European Council for Modelling and Simulation 2023 DOI: 10.7148/2023-0270 · Ref ID: 5280 Computer assisted process model development from textual descriptions is still an open research question. Advantages of such a technology lie in shorter development times and possibly a more concise interpretation of the narrative input. A solution to this problem necessarily relies on methods from formal modeling and linguistics. In the latter field, the new GPT-3 model is recognized as a breakthrough that outperforms previous technologies whose limitations hindered success of earlier research in this context. But are GPT-3’s capabilities to summarize text, detect cause-and-effect, or to classify terms sufficient to succeed? The presented research describes the results of systematic experiments to use GPT-3 to interpret a textual process description and transform it into a formal representation. The different settings demonstrate how to exploit the capabilities of large language models and how to avoid pitfalls. Although the observations made are promising, further work is needed. The outcome of this paper identifies the direction in which this future research should proceed. © ECMS Enrico Vicario, Romeo Bandinelli, Virginia Fani, Michele Mastroianni (Editors) 2023. |
mohammed afaan voted · yuexi voted
#790
-
Sinclair 2022
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations
We investigate the extent to which modern neural language models are susceptible to structural priming, the phenomenon whereby the structure of a sentence makes the same structure more probable in a follow-up sentence. We explore how priming can be used to study the potential of these models to learn abstract structural information, which is a prerequisite for good performance on tasks that require natural language understanding skills. We introduce a novel metric and release Prime-LM, a large corpus where we control for various linguistic factors that interact with priming strength. We find that Transformer models indeed show evidence of structural priming, but also that the generalizations they learned are to some extent modulated by semantic information. Our experiments also show that the representations acquired by the models may not only encode abstract sequential structure but also involve a certain level of hierarchical syntactic information. More generally, our study shows that the priming paradigm is a useful additional tool for gaining insights into the capacities of language models and opens the door to future priming-based investigations that probe the model's internal states. |
Davis voted · Srividya voted
#96
-
Skryd 2024
ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study
Background: Large language models (LLMs) are computational artificial intelligence systems with advanced natural language processing capabilities that have recently been popularized among health care students and educators due to their ability to provide real-time access to a vast amount of medical knowledge. The adoption of LLM technology into medical education and training has varied, and little empirical evidence exists to support its use in clinical teaching environments. Objective: The aim of the study is to identify and qualitatively evaluate potential use cases and limitations of LLM technology for real-time ward-based educational contexts. Methods: A brief, single-site exploratory evaluation of the publicly available ChatGPT-3.5 (OpenAI) was conducted by implementing the tool into the daily attending rounds of a general internal medicine inpatient service at a large urban academic medical center. ChatGPT was integrated into rounds via both structured and organic use, using the web-based "chatbot" style interface to interact with the LLM through conversational free-text and discrete queries. A qualitative approach using phenomenological inquiry was used to identify key insights related to the use of ChatGPT through analysis of ChatGPT conversation logs and associated shorthand notes from the clinical sessions. Results: Identified use cases for ChatGPT integration included addressing medical knowledge gaps through discrete medical knowledge inquiries, building differential diagnoses and engaging dual-process thinking, challenging medical axioms, using cognitive aids to support acute care decision-making, and improving complex care management by facilitating conversations with subspecialties. Potential additional uses included engaging in difficult conversations with patients, exploring ethical challenges and general medical ethics teaching, personal continuing medical education resources, developing ward-based teaching tools, supporting and automating clinical documentation, and supporting productivity and task management. LLM biases, misinformation, ethics, and health equity were identified as areas of concern and potential limitations to clinical and training use. A code of conduct on ethical and appropriate use was also developed to guide team usage on the wards. Conclusions: Overall, ChatGPT offers a novel tool to enhance ward-based learning through rapid information querying, second-order content exploration, and engaged team discussion regarding generated responses. More research is needed to fully understand contexts for educational use, particularly regarding the risks and limitations of the tool in clinical settings and its impacts on trainee development. |
brandon voted · Kwesi voted
#2206
-
Smith 2014
BPAL: A tool for managing semantically enriched conceptual process models
eChallenges e-2014 Conference Proceedings 2014;():1-10 2014 Ref ID: 6557 In this paper we will provide an overview of the Business Process Abstract Language (BPAL) Platform, which implements a Business Process (BP) modelling and reasoning environment where the procedural knowledge of a BP can be enriched through ontology-based annotations. The BPAL Platform provides a graphical user interface to ease the definition of a Business Process Knowledge Base that collects the various facets of process knowledge. It also provides a reasoner implementing services for the enactment, verification, retrieval, and composition of processes in the knowledge base. After discussing the functionalities and the architecture of the tool, we report on an experimental evaluation of the whole system, whose results are encouraging and show the viability of the approach. |
mohammed afaan voted · yuexi voted
#1699
-
Snyder 2024
On Early Detection of Hallucinations in Factual Question Answering
Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2024;():2721-2732 Association for Computing Machinery 2024 DOI: 10.1145/3637528.3671796 · Ref ID: 3933 While large language models (LLMs) have taken great strides towards helping humans with a plethora of tasks, hallucinations remain a major impediment towards gaining user trust. The fluency and coherence of model generations even when hallucinating makes detection a difficult task. In this work, we explore if the artifacts associated with the model generations can provide hints that the generation will contain hallucinations. Specifically, we probe LLMs at 1) the inputs via Integrated Gradients based token attribution, 2) the outputs via the Softmax probabilities, and 3) the internal state via self-attention and fully-connected layer activations for signs of hallucinations on open-ended question answering tasks. Our results show that the distributions of these artifacts tend to differ between hallucinated and non-hallucinated generations. Building on this insight, we train binary classifiers that use these artifacts as input features to classify model generations into hallucinations and non-hallucinations. These hallucination classifiers achieve up to 0.80 AUROC. We also show that tokens preceding a hallucination can already predict the subsequent hallucination even before it occurs. © 2024 Copyright held by the owner/author(s). |
yuexi voted · Srividya voted
#306
-
Song 2023
Generative Event Extraction via Internal Knowledge-Enhanced Prompt Learning
32nd International Conference on Artificial Neural Networks (ICANN) 2023;14258():90-102 Heraklion, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-44192-9_8 · Ref ID: 3671 Event extraction is a crucial research task in information extraction. To maximize the performance of pre-trained language models (PLMs), some works formulate event extraction as a conditional generation problem. However, most existing generative methods ignore the prior information between event entities and are usually over-dependent on hand-crafted templates, causing subjective intervention. In this paper, we propose a generative event extraction model named KEPGEE based on internal knowledge-enhanced prompt learning. We first use relational graph convolutional networks (RGCN) to encode the event triple entities and fuse them with the word embeddings to obtain the knowledge representation. The knowledge representation is then concatenated with task-specific virtual tokens to compose knowledge-enhanced soft prompts, which can provide additional event information to adapt the sequence-to-sequence PLM for the generative event extraction task. Besides, in template design, we add related topic words into the prompt templates to enhance the implicit event information. We evaluate our model on the ACE2005 and ERE datasets, and the results show that our model matches or outperforms several classification-based and generation-based event extraction models (including the state-of-the-art models). |
Srividya voted · Xinchen voted
#1445
-
Song 2024
ITAKE: Interactive Unstructured Text Annotation and Knowledge Extraction System with LLMs and ModelOps
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;3():326-334 Association for Computational Linguistics (ACL) 2024 Ref ID: 4344 Extracting structured knowledge from unstructured text data has a wide range of application prospects, and a pervasive trend is to develop text annotation tools to help extraction. However, such tools often encounter issues such as single-scenario usage, lack of effective human-machine collaboration, insufficient model supervision, and suboptimal utilization of Large Language Models (LLMs). We introduce an interactive unstructured text annotation and knowledge extraction system that synergistically integrates LLMs and ModelOps to alleviate these issues. The system leverages LLMs for enhanced performance in low-resource contexts, employs a ModelOps platform to monitor models throughout their lifecycle, and amalgamates interactive annotation methods with online machine learning and active learning. The demo video and website are now publicly available. © 2024 Association for Computational Linguistics. |
Xinchen voted · Srividya voted
#1662
-
Song 2023
Multilingual Knowledge Graph Completion from Pretrained Language Models with Knowledge Constraints
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():7709-7721 Association for Computational Linguistics (ACL) 2023 Ref ID: 5154 Multilingual Knowledge Graph Completion (mKGC) aims at solving queries like (h, r, ?) in different languages by reasoning about a tail entity t, thus improving multilingual knowledge graphs. Previous studies leverage multilingual pretrained language models (PLMs) and the generative paradigm to achieve mKGC. Although multilingual pretrained language models contain extensive knowledge of different languages, their pretraining tasks cannot be directly aligned with the mKGC tasks. Moreover, the majority of KGs and PLMs currently available exhibit a pronounced English-centric bias. This makes it difficult for mKGC to achieve good results, particularly in the context of low-resource languages. To overcome these problems, this paper introduces global and local knowledge constraints for mKGC. The former is used to constrain the reasoning of answer entities, while the latter is used to enhance the representation of query contexts. The proposed method makes the pretrained model better adapt to the mKGC task. Experimental results on public datasets demonstrate that our method outperforms the previous SOTA on Hits@1 and Hits@10 by an average of 12.32% and 16.03%, indicating that our proposed method yields a significant enhancement on mKGC. |
Xinchen voted · Srividya voted
#2392
-
Song 2024
Enhancing Text-to-SQL Translation for Financial System Design
2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) 2024;():252-262 2024 DOI: 10.1145/3639477.3639732 · Ref ID: 6581 Text-to-SQL, the task of translating natural language questions into SQL queries, is part of various business processes. Its automation, which is an emerging challenge, will empower software practitioners to seamlessly interact with relational databases using natural language, thereby bridging the gap between business needs and software capabilities. In this paper, we consider Large Language Models (LLMs), which have achieved state-of-the-art results on various NLP tasks. Specifically, we benchmark Text-to-SQL performance and evaluation methodologies, as well as input optimization (e.g., prompting). In light of the empirical observations we have made, we propose two novel metrics designed to adequately measure the similarity between SQL queries. Overall, we share with the community various findings, notably on how to select the right LLM for Text-to-SQL tasks. We further demonstrate that a tree-based edit distance constitutes a reliable metric for assessing the similarity between generated SQL queries and the oracle for benchmarking Text2SQL approaches. This metric is important as it relieves researchers from the need to perform computationally expensive experiments such as executing generated queries as done in prior works. Our work implements financial domain use cases and therefore contributes to the advancement of Text2SQL systems and their practical adoption in this domain. |
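The paper's metric operates on parsed SQL trees; as a simplified stand-in, the same edit-distance idea can be illustrated at the token level with a standard Levenshtein distance (the queries below are hypothetical examples, not from the paper):

```python
def token_edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance over token sequences, via row-wise dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ta in enumerate(a, 1):
        curr = [i]
        for j, tb in enumerate(b, 1):
            cost = 0 if ta == tb else 1
            curr.append(min(prev[j] + 1,          # delete ta
                            curr[j - 1] + 1,      # insert tb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

gen = "SELECT name FROM clients WHERE balance > 100".split()
gold = "SELECT name FROM clients WHERE balance >= 100".split()
# One substituted token (> vs >=), so the distance is 1.
assert token_edit_distance(gen, gold) == 1
```

A tree-based variant compares AST nodes instead of tokens, which makes it robust to formatting and clause-ordering differences that a flat token comparison penalizes.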
mohammed afaan voted · yuexi voted
#1851
-
Song 2024
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Embodied AI is one of the most popular studies in artificial intelligence and robotics, which can effectively improve the intelligence of real-world agents (i.e., robots) serving human beings. Scene knowledge is important for an agent to understand its surroundings and make correct decisions in the varied open world. Currently, a knowledge base for embodied tasks is missing, and most existing work uses general knowledge bases or pre-trained models to enhance the intelligence of an agent. Conventional knowledge bases are sparse, insufficient in capacity, and costly in data collection; pre-trained models face uncertainty of knowledge and hard maintenance. To overcome the challenges of scene knowledge, we propose a scene-driven multimodal knowledge graph (Scene-MMKG) construction method combining conventional knowledge engineering and large language models. A unified scene knowledge injection framework is introduced for knowledge representation. To evaluate the advantages of our proposed method, we instantiate Scene-MMKG considering typical indoor robotic functionalities (Manipulation and Mobility), named ManipMob-MMKG. Comparisons in characteristics indicate that our instantiated ManipMob-MMKG has broad superiority in data-collection efficiency and knowledge quality. Experimental results on typical embodied tasks show that knowledge-enhanced methods using our instantiated ManipMob-MMKG can clearly improve performance without complex re-design of model structures. © 2024 IEEE.
brandon voted · Kwesi voted
#619
-
Sovrano 2023
An objective metric for Explainable AI: How and why to estimate the degree of explainability
This paper presents a new method for objectively measuring the explainability of textual information, such as the outputs of Explainable AI (XAI). We introduce a metric called Degree of Explainability (DoX), drawing inspiration from Ordinary Language Philosophy and Achinstein's theory of explanations. It assumes that the degree of explainability is directly proportional to the number of relevant questions that a piece of information can correctly answer. We have operationalized this concept by formalizing the DoX metric through a mathematical formula, which we have integrated into a software tool named DoXpy. DoXpy relies on pre-trained deep language models for knowledge extraction and answer retrieval in order to estimate the DoX, transforming our theoretical insights into a practical tool for real-world applications. To confirm the effectiveness and consistency of our approach, we conducted comprehensive experiments and user studies with over 190 participants. These studies evaluated the quality of explanations by healthcare and finance XAI-based software systems. Our results demonstrate a correlation between increases in objective explanation usability and increments in the DoX score. These findings suggest that the DoX metric is congruent with other mainstream explainability measures. It provides a more objective and cost-effective alternative to non-deterministic user studies. Thus, we discuss the potential of DoX as a tool to evaluate the legal compliance of XAI systems. By bridging the gap between theory and practice in Explainable AI, our work fosters transparency, understandability, and legal compliance. DoXpy and related materials have been made available online to ensure reproducibility. © 2023 Elsevier B.V. All rights reserved.
Xinchen voted · mohammed afaan voted
#1295
-
Sreekantan 2022
Expert System for Question Answering on Anomalous Events and Mitigation Strategies Using Bidirectional Transformers and Knowledge Graphs
Society of Petroleum Engineers - ADIPEC 2022 2022;(): Society of Petroleum Engineers 2022 DOI: 10.2118/211855-MS · Ref ID: 5422 Daily drilling reports provide vital information for well planning as they capture anomalous events and mitigation measures during drilling operations. Previous works predominantly focus on search frameworks for information retrieval from these reports. However, the context between searches is lost, preventing users from narrowing down to the exact answer. Here, we present a transformer-based closed-domain conversational agent for longer dialogues to guide users to contextual information for anomalous drilling events through natural language. Automated text extraction, cleaning and validation tasks are initially performed to resolve data quality issues prior to language modeling on a validated data set. Subsequently, a knowledge graph is created by node embedding using entity extractions and by learning the semantic-level relationships between entity nodes such as well names and events. Further, conversational agents are trained on the knowledge graphs for natural dialogue generation using neural machine translation models. Here, users' questions are translated into a query in a structured language that is evaluated directly over the knowledge graph in order to generate the desired answers. The workflow was tested on an asset with multiple wells experiencing several anomalous events during drilling such as stuck pipe, circulation losses and kicks. The end-to-end workflow was tested on its ability to retrieve anomalous events and present mitigation measures in the aforementioned data set based on the descriptions input by survey participants. Performance on anomaly extraction, attribute mapping and mitigation was evaluated through F1 scores. A significantly high F1 score was recorded for anomaly extraction. This is predominantly driven by high precision due to explicit modeling of the reports as a knowledge graph. 
In addition to testing the workflow end to end, we tested the knowledge graph representation in isolation. For this, ranking metrics and triple classification with negative samples were used for the evaluation. The adjusted mean rank index was close to one, indicating high performance. Structured querying on the knowledge graphs also showed high accuracy for classifying anomalous events in the drilling report. The work described in this paper automates the end-to-end workflow for building an expert system for answering questions about anomalous events and mitigation strategies using daily drilling reports. Our novel approach using a knowledge graph with a transformer-based conversational agent enables users to perform detailed interactive investigation of anomalous events observed in daily drilling reports and create mitigation strategies. The workflow also allows for incorporating prior domain knowledge from drilling experts. Copyright © 2022, Society of Petroleum Engineers. |
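The "adjusted mean rank index ... close to one" reported here is a standard knowledge-graph-embedding evaluation statistic that rescales mean rank against a uniform-random baseline; a minimal sketch under the common definition (assuming a fixed candidate-set size per query; this is not the authors' code):

```python
def adjusted_mean_rank_index(ranks, num_candidates):
    """Adjusted mean rank index: 1.0 = perfect ranking,
    0.0 = no better than random guessing, negative = worse.

    `ranks` are 1-based ranks of true triples among
    `num_candidates` scored candidates per query.
    """
    mean_rank = sum(ranks) / len(ranks)
    expected = (num_candidates + 1) / 2  # mean rank of a random scorer
    return 1 - (mean_rank - 1) / (expected - 1)

print(adjusted_mean_rank_index([1, 2, 1], 500))  # close to 1 -> strong model
```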
yuexi voted · Mike voted
#3549
-
Stankevich 2024
Interpreting and learning voice commands with a Large Language Model for a robot system
arXiv 2024;(): 2024 Ref ID: 8498 Robots are increasingly common in industry and daily life, such as in nursing homes where they can assist staff. A key challenge is developing intuitive interfaces for easy communication. The use of Large Language Models (LLMs) like GPT-4 has enhanced robot capabilities, allowing for real-time interaction and decision-making. This integration improves robots' adaptability and functionality. This project focuses on merging LLMs with databases to improve decision-making and enable knowledge acquisition for request interpretation problems. |
Davis voted · mohammed afaan voted
#1241
-
Stavropoulos 2023
Empowering Knowledge Discovery from Scientific Literature: A novel approach to Research Artifact Analysis
3rd Workshop for Natural Language Processing Open Source Software, NLP-OSS 2023, Proceedings of the Workshop 2023;():37-53 Association for Computational Linguistics (ACL) 2023 DOI: 10.18653/v1/2023.nlposs-1.5 · Ref ID: 4916 Knowledge extraction from scientific literature is a major issue, crucial to promoting transparency, reproducibility, and innovation in the research community. In this work, we present a novel approach towards the identification, extraction and analysis of dataset and code/software mentions within scientific literature. We introduce a comprehensive dataset, synthetically generated by ChatGPT and meticulously curated, augmented, and expanded with real snippets of scientific text from full-text publications in Computer Science using a human-in-the-loop process. The dataset contains snippets highlighting mentions of the two research artifact (RA) types: dataset and code/software, along with insightful metadata including their Name, Version, License, URL as well as the intended Usage and Provenance. We also fine-tune a simple Large Language Model (LLM) using Low-Rank Adaptation (LoRA) to transform the Research Artifact Analysis (RAA) into an instruction-based Question Answering (QA) task. Ultimately, we report the improvements in performance on the test set of our dataset when compared to other base LLM models. Our method provides a significant step towards facilitating accurate, effective, and efficient extraction of datasets and software from scientific papers, contributing to the challenges of reproducibility and reusability in scientific research. © 2023 Association for Computational Linguistics. |
Srividya voted · Xinchen voted
#877
-
Steenwinckel 2021
Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge Graphs
32nd International Conference on Database and Expert Systems Applications (DEXA) 2021;1479():70-80 Electr Network Springer International Publishing Ag 2021 DOI: 10.1007/978-3-030-87101-7_8 · Ref ID: 3019 As Knowledge Graphs are symbolic constructs, specialized techniques have to be applied in order to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modeling techniques. The original work proposed the Weisfeiler-Lehman kernel to improve the quality of the representations. However, in this work, we show that the Weisfeiler-Lehman kernel does little to improve walk embeddings in the context of a single Knowledge Graph. As an alternative, we examined five alternative strategies to extract information complementary to basic random walks and compare them on several benchmark datasets to show that research within this field is still relevant for node classification tasks. |
brandon voted · Kwesi voted
#2173
-
Steinegger 2016
Automatic generation of diagnostic handling code for decentralized PLC-based control architectures
2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA) 2016;():1-8 2016 DOI: 10.1109/ETFA.2016.7733694 · Ref ID: 7067 In this paper, an ontology-based approach to automatically generate control applications to handle diagnostic information of decentralized control devices is presented. Diagnostic possibilities of modern remote I/O devices are analyzed and software components in terms of function blocks to handle the specific diagnostic information are defined. After a detailed conceptual overview, the application of the proposed knowledge-based code generation approach to a PLC-based control architecture of a hot rolling mill is described. It is shown that the proposed approach significantly reduces engineering time and the error rate in the design processes of industrial control and diagnostic applications, since the application engineering is raised to an abstract level by utilizing pre-defined, tested, and reusable function blocks and a user-definable set of code generation rules to encode repetitive implementation tasks. The rules are defined in the query language SPARQL with additional ARQ functions to reduce the complexity. |
mohammed afaan voted · yuexi voted
#3418
-
Steinigen 2024
Fact Finder – Enhancing Domain Expertise of Large Language Models by Incorporating Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8511 Recent advancements in Large Language Models (LLMs) have showcased their proficiency in answering natural language queries. However, their effectiveness is hindered by limited domain-specific knowledge, raising concerns about the reliability of their responses. We introduce a hybrid system that augments LLMs with domain-specific knowledge graphs (KGs), thereby aiming to enhance factual correctness using a KG-based retrieval approach. We focus on a medical KG to demonstrate our methodology, which includes (1) pre-processing, (2) Cypher query generation, (3) Cypher query processing, (4) KG retrieval, and (5) LLM-enhanced response generation. We evaluate our system on a curated dataset of 69 samples, achieving a precision of 78% in retrieving correct KG nodes. Our findings indicate that the hybrid system surpasses a standalone LLM in accuracy and completeness, as verified by an LLM-as-a-Judge evaluation method. This positions the system as a promising tool for applications that demand factual correctness and completeness, such as target identification – a critical process in pinpointing biological entities for disease treatment or crop enhancement. Moreover, its intuitive search interface and ability to provide accurate responses within seconds make it well-suited for time-sensitive, precision-focused research contexts. We publish the source code together with the dataset and the prompt templates used. |
yuexi voted · Mike voted
#2594
-
Steinmetz 2018
Internet of Things Ontology for Digital Twin in Cyber Physical Systems
2018 VIII Brazilian Symposium on Computing Systems Engineering (SBESC) 2018;():154-159 2018 DOI: 10.1109/SBESC.2018.00030 · Ref ID: 6757 The Digital Twin is one of the most important concepts in the Cyber Physical Systems (CPS) era. It can bring benefits such as simulation, monitoring, or management once it joins the physical and the virtual through the Internet of Things. This concept is being adopted more and more in academia and in industry, but there is still a lack of methods to define and formalize the representation of the Digital Twin, such as semantic models. Ontologies are a way of representing knowledge that can be shared between different entities, allowing a common understanding of information. In this sense, this work proposes an ontology to represent the Digital Twin in the context of CPS and embedded systems. These concepts are implemented through a proposed architecture. The proposed ideas are being evaluated with industrial case studies, and some of the preliminary results are described in the paper.
mohammed afaan voted · Ishan voted
#3078
-
Štolc 2010
A visual based framework for the model refactoring techniques
2010 IEEE 8th International Symposium on Applied Machine Intelligence and Informatics (SAMI) 2010;():72-82 2010 DOI: 10.1109/SAMI.2010.5423766 · Ref ID: 6520 Refactoring is one of the most important rules and practices of Extreme Programming from the family of the Agile Methodologies. We propose a tool to refactor UML models (Class Diagrams for now). In the first step, the flaws (bad smells) in the model are found with an OCL query; in the second step, each flaw is transformed into a correct fragment with a transformation script. The paper presents a set of methods and tools for model adjustment, cooperating with CASE systems. We analyze the concepts and algorithms for refactoring, OCL queries, and transformation script generation. We have prepared a functional prototype of the editor for refactoring-rule definition, an OCL query generator, and a transformation script generator. In the future, we plan to extend the framework with alternative notations (e.g., QVT graph transformation rules, PICS, Viatra2) and other techniques to find flaws (e.g., a rule-based system with predicates of the bad smells, XMI transformations and Abstract Syntax Tree algebra, Bit-Vector and Similarity Scoring Algorithms).
mohammed afaan voted · yuexi voted
#3034
-
Stramandinoli 2011
Towards the grounding of abstract words: A Neural Network model for cognitive robots
The 2011 International Joint Conference on Neural Networks 2011;():467-474 2011 DOI: 10.1109/IJCNN.2011.6033258 · Ref ID: 6277 In this paper, a model based on Artificial Neural Networks (ANNs) extends the symbol grounding mechanism to abstract words for cognitive robots. The aim of this work is to obtain a semantic representation of abstract concepts through the grounding in sensorimotor experiences for a humanoid robotic platform. Simulation experiments have been developed on a software environment for the iCub robot. Words that express general actions with a sensorimotor component are first taught to the simulated robot. During the training stage the robot first learns to perform a set of basic action primitives through the mechanism of direct grounding. Subsequently, the grounding of action primitives, acquired via direct sensorimotor experience, is transferred to higher-order words via linguistic descriptions. The idea is that by combining words grounded in sensorimotor experience the simulated robot can acquire more abstract concepts. The experiments aim to teach the robot the meaning of abstract words by making it experience sensorimotor actions. The iCub humanoid robot will be used for testing experiments on a real robotic architecture. |
mohammed afaan voted · yuexi voted
#2233
-
Su 2009
A Chinese Document Retrieval Method Enhanced by Concept Base
2009 WRI World Congress on Computer Science and Information Engineering 2009;5():200-203 2009 DOI: 10.1109/CSIE.2009.496 · Ref ID: 6440 Full-text searching techniques have been extensively used in the area of information retrieval. However, the full-text searching techniques are often insufficient to retrieve meaningful or valuable documents since the basic idea of these techniques is word or phrase matching, not concept matching. A Chinese document retrieval method enhanced by concept base is proposed in this paper. The main idea of this method is to build a common Chinese concept base to provide a shared understanding of concepts. This enhanced method can take advantage of the concept base when analyzing and indexing documents, and when searching documents. The document management system can use this method to improve the retrieval performance. |
Davis voted · mohammed afaan voted
#2018
-
Su 2024
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14379-14391 Association for Computational Linguistics (ACL) 2024 Ref ID: 4404 Hallucinations in large language models (LLMs) refer to the phenomenon of LLMs producing responses that are coherent yet factually inaccurate. This issue undermines the effectiveness of LLMs in practical applications, necessitating research into detecting and mitigating hallucinations of LLMs. Previous studies have mainly concentrated on post-processing techniques for hallucination detection, which tend to be computationally intensive and limited in effectiveness due to their separation from the LLM's inference process. To overcome these limitations, we introduce MIND, an unsupervised training framework that leverages the internal states of LLMs for real-time hallucination detection without requiring manual annotations. Additionally, we present HELM, a new benchmark for evaluating hallucination detection across multiple LLMs, featuring diverse LLM outputs and the internal states of LLMs during their inference process. Our experiments demonstrate that MIND outperforms existing state-of-the-art methods in hallucination detection. © 2024 Association for Computational Linguistics. |
yuexi voted · Srividya voted
#1637
-
Su 2023
MeKB-Rec: Personal Knowledge Graph Learning for Cross-Domain Recommendation
CEUR Workshop Proceedings 2023;3560():90-102 CEUR-WS 2023 Ref ID: 5028 It is a long-standing challenge in modern recommender systems to make recommendations for new users, namely the cold-start problem. Cross-Domain Recommendation (CDR) has been proposed to address this challenge, but current ways to represent users’ interests across systems are still severely limited. We introduce the Personal Knowledge Graph (PKG) as a domain-invariant interest representation, and propose a novel CDR paradigm named MeKB-Rec. We first link users and entities in a knowledge base to construct a PKG of users’ interests, named MeKB. Then we learn a semantic representation of MeKB for cross-domain recommendation. Beyond most existing systems, our approach builds a semantic mapping across domains using Pretrained Language Models, which breaks the requirement for in-domain user behaviors, enabling zero-shot recommendations for new users in a low-resource domain. We evaluate MeKB-Rec on well-established public CDR datasets, and demonstrate that the new formulation achieves a new state-of-the-art that significantly improves HR@10 and NDCG@10 metrics over the best previous approaches by 24%–91%, with a 105% improvement for HR@10 of zero-shot users with no behavior in the target domain. We deploy MeKB-Rec in WeiXin recommendation scenarios and achieve significant gains in core online metrics. MeKB-Rec is now serving hundreds of millions of users in real-world products. © 2023 Copyright for this paper by its authors.
brandon voted · Kwesi voted
#1253
-
Su 2024
Enhancing Exploratory Testing by Large Language Model and Knowledge Graph
Proceedings - International Conference on Software Engineering 2024;():1197-1208 IEEE Computer Society 2024 DOI: 10.1145/3597503.3639157 · Ref ID: 4640 Exploratory testing leverages the tester's knowledge and creativity to design test cases for effectively uncovering system-level bugs from the end user's perspective. Researchers have worked on test scenario generation to support exploratory testing based on a system knowledge graph, enriched with scenario and oracle knowledge from bug reports. Nevertheless, the adoption of this approach is hindered by difficulties in handling bug reports of inconsistent quality and varied expression styles, along with the infeasibility of the generated test scenarios. To overcome these limitations, we utilize the superior natural language understanding (NLU) capabilities of Large Language Models (LLMs) to construct a System KG of User Tasks and Failures (SysKG-UTF). Leveraging the system and bug knowledge from the KG, along with the logical reasoning capabilities of LLMs, we generate test scenarios with high feasibility and coherence. Particularly, we design chain-of-thought (CoT) reasoning to extract human-like knowledge and logical reasoning from LLMs, simulating a developer's process of validating test scenario feasibility. Our evaluation shows that our approach significantly enhances the KG construction, particularly for bug reports with low quality. Furthermore, our approach generates test scenarios with high feasibility and coherence. The user study further proves the effectiveness of our generated test scenarios in supporting exploratory testing. Specifically, 8 participants find 36 bugs from 8 seed bugs in two hours using our test scenarios, a significant improvement over the 21 bugs found by the state-of-the-art baseline. © 2024 ACM. |
mohammed afaan voted · yuexi voted
#1622
-
Subramanian 2024
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():4002-4042 Association for Computational Linguistics (ACL) 2024 Ref ID: 4403 There is vivid research on adapting Large Language Models (LLMs) to perform a variety of tasks in high-stakes domains such as healthcare. Despite their popularity, there is a lack of understanding of the extent and contributing factors that allow LLMs to recall relevant knowledge and combine it with presented information in the clinical and biomedical domain, a fundamental prerequisite for success on downstream tasks. Addressing this gap, we use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains. Our multifaceted analysis of the performance of 15 LLMs, further broken down by sub-domain, source of knowledge and model architecture, uncovers success factors such as instruction tuning that lead to improved recall and comprehension. We further show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results, even generalising to unseen specialist sub-domains. We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented context. To foster research and collaboration in this field, we share M-QALM (our resources, standardised methodology, and evaluation results) with the research community to facilitate further advancements in clinical knowledge representation learning within language models. © 2024 Association for Computational Linguistics.
Mike voted · Xinchen voted
#1492
-
Suchanek 2023
Knowledge Bases and Language Models: Complementing Forces
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14244 LNCS():3-15 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-45072-3_1 · Ref ID: 5185 Large language models (LLMs), as a particular instance of generative artificial intelligence, have revolutionized natural language processing. In this invited paper, we argue that LLMs are complementary to structured data repositories such as databases or knowledge bases, which use symbolic knowledge representations. Hence, the two ways of knowledge representation will likely continue to co-exist, at least in the near future. We discuss ways that have been explored to make the two approaches work together, and point out opportunities and challenges for their symbiosis. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
Ishan voted · Srividya voted
#3559
-
Sukhwal 2024
A Joint-Reasoning based Disease Q&A System
arXiv 2024;(): 2024 Ref ID: 8027 Medical question answer (QA) assistants respond to lay users' health-related queries by synthesizing information from multiple sources using natural language processing and related techniques. They can serve as vital tools to alleviate issues of misinformation, information overload, and complexity of medical language, thus addressing lay users' information needs while reducing the burden on healthcare professionals. QA systems, the engines of such assistants, have typically used either language models (LMs) or knowledge graphs (KG), though the approaches could be complementary. LM-based QA systems excel at understanding complex questions and providing well-formed answers, but are prone to factual mistakes. KG-based QA systems, which represent facts well, are mostly limited to answering short-answer questions with pre-created templates. While a few studies have jointly used LM and KG approaches for text-based QA, this was done to answer multiple-choice questions. Extant QA systems also have limitations in terms of automation and performance. We address these challenges by designing a novel, automated disease QA system which effectively utilizes both LM and KG techniques through a joint-reasoning approach to answer disease-related questions appropriate for lay users. Our evaluation of the system using a range of quality metrics demonstrates its efficacy over benchmark systems, including the popular ChatGPT. |
Mike voted · Srividya voted
#1001
-
Sumanathilaka 2024
Assessing GPT's Potential for Word Sense Disambiguation: A Quantitative Evaluation on Prompt Engineering Techniques
2024 IEEE 15th Control and System Graduate Research Colloquium, ICSGRC 2024 - Conference Proceeding 2024;():204-209 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICSGRC62081.2024.10691283 · Ref ID: 4163 Modern digital communications (including social media content) often contain ambiguous words due to their potential for multiple related interpretations (polysemy). This ambiguity poses challenges for traditional Word Sense Disambiguation (WSD) methods, which struggle with limited data and lack of contextual understanding. These limitations hinder efficient translation, information retrieval, and question-answering systems, thereby restricting the benefits of computational linguistics techniques when applied to digital communication technologies. Our research investigates the use of Large Language Models (LLMs) to improve WSD using various prompt engineering techniques. We propose and evaluate a novel method that combines a knowledge graph with Part-of-Speech (POS) tagging and few-shot prompting to guide LLMs. By utilizing prompt augmentation with a human in the loop for few-shot prompting, this work demonstrates a substantial improvement in WSD. This research advances accurate word interpretation in digital communications, leading to important implications for improved translation systems, better search results, and more intelligent question-answering technology. © 2024 IEEE.
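The combination of POS tags, candidate senses, and few-shot examples described in this abstract amounts to assembling a structured prompt; a minimal sketch of that idea (the function name, template, and example data are illustrative assumptions, not the paper's):

```python
def build_wsd_prompt(sentence, word, pos, senses, shots):
    """Assemble a few-shot WSD prompt from POS-tagged solved examples.

    `shots` is a list of (sentence, word, pos, sense) tuples;
    `senses` are candidate glosses for the target word.
    """
    blocks = [
        f"Sentence: {s}\nWord: {w} ({p})\nSense: {g}" for s, w, p, g in shots
    ]
    options = "; ".join(f"{i}. {g}" for i, g in enumerate(senses, 1))
    blocks.append(
        f"Sentence: {sentence}\nWord: {word} ({pos})\n"
        f"Candidate senses: {options}\nSense:"
    )
    return "Disambiguate the target word in context.\n\n" + "\n\n".join(blocks)
```

The returned string ends with an open "Sense:" slot, so the LLM's completion is constrained toward one of the listed glosses.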
mohammed afaan voted · yuexi voted
#3204
-
Sumpter 2024
Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models
arXiv 2024;(): 2024 Ref ID: 8265 This study introduces a transformative framework for medical education by integrating semi-structured data with Large Language Models (LLMs), primarily OpenAI's ChatGPT-3.5, to automate the creation of medical simulation scenarios. Traditionally, developing these scenarios was a time-intensive process with limited flexibility to meet diverse educational needs. The proposed approach utilizes AI to efficiently generate detailed, clinically relevant scenarios that are tailored to specific educational objectives. This innovation has significantly reduced the time and resources required for scenario development, allowing for a broader variety of simulations. Preliminary feedback from educators and learners has shown enhanced engagement and improved knowledge acquisition, confirming the effectiveness of this AI-enhanced methodology in simulation-based learning. The integration of structured data with LLMs not only streamlines the creation process but also offers a scalable, dynamic solution that could revolutionize medical training, highlighting the critical role of AI in advancing educational outcomes and patient care standards.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#161
-
Sun 2021
Deep learning with language models improves named entity recognition for PharmaCoNER
Background The recognition of pharmacological substances, compounds and proteins is essential for biomedical relation extraction, knowledge graph construction, drug discovery, as well as medical question answering. Although considerable efforts have been made to recognize biomedical entities in English texts, to date, only a few limited attempts have been made to recognize them in biomedical texts in other languages. PharmaCoNER is a named entity recognition challenge to recognize pharmacological entities from Spanish texts. Because abundant resources are currently available in the field of natural language processing, how to leverage these resources for the PharmaCoNER challenge is a meaningful question. Methods Inspired by the success of deep learning with language models, we compare and explore various representative BERT models to promote the development of the PharmaCoNER task. Results The experimental results show that deep learning with language models can effectively improve model performance on the PharmaCoNER dataset. Our method achieves state-of-the-art performance on the PharmaCoNER dataset, with a max F1-score of 92.01%. Conclusion For the BERT models on the PharmaCoNER dataset, biomedical domain knowledge has a greater impact on model performance than the native language (i.e., Spanish). The BERT models can obtain competitive performance by using WordPiece to alleviate the out-of-vocabulary limitation. The performance of the BERT models can be further improved by constructing a specific vocabulary based on domain knowledge. Moreover, character casing also has a certain impact on model performance. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3761
-
Sun 2024
Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement
arXiv 2024;(): 2024 Ref ID: 8109 The increasing demand for personalized interactions with large language models (LLMs) calls for methodologies capable of accurately and efficiently identifying user opinions and preferences. Retrieval augmentation emerges as an effective strategy, as it can accommodate a vast number of users without the costs from fine-tuning. Existing research, however, has largely focused on enhancing the retrieval stage and devoted limited exploration toward optimizing the representation of the database, a crucial aspect for tasks such as personalization. In this work, we examine the problem from a novel angle, focusing on how data can be better represented for more data-efficient retrieval in the context of LLM customization. To tackle this challenge, we introduce Persona-DB, a simple yet effective framework consisting of a hierarchical construction process to improve generalization across task contexts and collaborative refinement to effectively bridge knowledge gaps among users. In the evaluation of response prediction, Persona-DB demonstrates superior context efficiency in maintaining accuracy with a significantly reduced retrieval size, a critical advantage in scenarios with extensive histories or limited context windows. Our experiments also indicate a marked improvement of over 10% under cold-start scenarios, when users have extremely sparse data. Furthermore, our analysis reveals the increasing importance of collaborative knowledge as the retrieval capacity expands. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3594
-
Sun 2024
Knowledge Graph Tuning: Real-time Large Language Model Personalization based on Human Feedback
arXiv 2024;(): 2024 Ref ID: 8330 Large language models (LLMs) have demonstrated remarkable proficiency in a range of natural language processing tasks. Once deployed, LLMs encounter users with personalized factual knowledge, and such personalized knowledge is consistently reflected through users' interactions with the LLMs. To enhance user experience, real-time model personalization is essential, allowing LLMs to adapt user-specific knowledge based on user feedback during human-LLM interactions. Existing methods mostly require back-propagation to fine-tune the model parameters, which incurs high computational and memory costs. In addition, these methods suffer from low interpretability, which will cause unforeseen impacts on model performance during long-term use, where the user's personalized knowledge is accumulated extensively. To address these challenges, we propose Knowledge Graph Tuning (KGT), a novel approach that leverages knowledge graphs (KGs) to personalize LLMs. KGT extracts personalized factual knowledge triples from users' queries and feedback and optimizes KGs without modifying the LLM parameters. Our method improves computational and memory efficiency by avoiding back-propagation and ensures interpretability by making the KG adjustments comprehensible to humans. Experiments with state-of-the-art LLMs, including GPT-2, Llama2, and Llama3, show that KGT significantly improves personalization performance while reducing latency and GPU memory costs. Ultimately, KGT offers a promising solution for effective, efficient, and interpretable real-time LLM personalization during user interactions with the LLMs. |
Kwesi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#32
-
Sun 2022
Assessing Scientific Research Papers with Knowledge Graphs
45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2022;():2467-2472 Madrid, SPAIN Assoc Computing Machinery 2022 DOI: 10.1145/3477495.3531879 · Ref ID: 3066 In recent decades, the growing scale of scientific research has led to numerous novel findings. Reproducing these findings is the foundation of future research. However, due to the complexity of experiments, manually assessing scientific research is laborious and time-intensive, especially in social and behavioral sciences. Although increasing reproducibility studies have garnered increased attention in the research community, there is still a lack of systematic ways for evaluating scientific research at scale. In this paper, we propose a novel approach towards automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during the KG construction, we combine information from two different perspectives: micro-level features that capture knowledge from published articles such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities such as authorship and reference information. We then learn low-dimensional representations using language models and knowledge graph embeddings for entities (nodes in KGs), which are further used for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1372
-
Sun 2024
Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs?
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():311-325 Association for Computational Linguistics (ACL) 2024 Ref ID: 4459 Since the recent prosperity of Large Language Models (LLMs), there have been interleaved discussions regarding how to reduce hallucinations from LLM responses, how to increase the factuality of LLMs, and whether Knowledge Graphs (KGs), which store the world knowledge in a symbolic form, will be replaced with LLMs. In this paper, we try to answer these questions from a new angle: How knowledgeable are LLMs? To answer this question, we constructed Head-to-Tail, a benchmark that consists of 18K question-answer (QA) pairs regarding head, torso, and tail facts in terms of popularity. We designed an automated evaluation method and a set of metrics that closely approximate the knowledge an LLM confidently internalizes. Through a comprehensive evaluation of 16 publicly available LLMs, we show that existing LLMs are still far from being perfect in terms of their grasp of factual knowledge, especially for facts of torso-to-tail entities. ©2024 Association for Computational Linguistics. |
yuexi
voted
Mike
voted
Final decision
What was the agreed final decision?
#260
-
Sun 2024
Exploring sequence-to-sequence taxonomy expansion via language model probing
A taxonomy is a knowledge graph of concept hierarchies that plays a significant role in semantic entailment and is widely used in many downstream natural language processing tasks. Distinct from building a taxonomy from scratch, the task of taxonomy expansion aims at enriching an existing taxonomy by adding new concepts. However, existing methods often construct only part of the semantic relationships for representing the taxonomy, which may ignore useful features. Meanwhile, as many recent models treat this task in an insertion-only manner, they face limitations when the new concept is not an insertion into the taxonomy. Therefore, we propose TaxoSeq, a method that converts the task of taxonomy expansion into a sequence-to-sequence setting, thereby effectively exploiting the entire structural features and naturally handling more expansion cases. Empowered by pre-trained language models such as T5, our approach is shown to achieve significant progress over other methods on three public benchmark datasets from SemEval. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1305
-
Sun 2024
Exploring sequence-to-sequence taxonomy expansion via language model probing[Formula presented]
A taxonomy is a knowledge graph of concept hierarchies that plays a significant role in semantic entailment and is widely used in many downstream natural language processing tasks. Distinct from building a taxonomy from scratch, the task of taxonomy expansion aims at enriching an existing taxonomy by adding new concepts. However, existing methods often construct only part of the semantic relationships for representing the taxonomy, which may ignore useful features. Meanwhile, as many recent models treat this task in an insertion-only manner, they face limitations when the new concept is not an insertion into the taxonomy. Therefore, we propose TaxoSeq, a method that converts the task of taxonomy expansion into a sequence-to-sequence setting, thereby effectively exploiting the entire structural features and naturally handling more expansion cases. Empowered by pre-trained language models such as T5, our approach is shown to achieve significant progress over other methods on three public benchmark datasets from SemEval. © 2023 Elsevier Ltd |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#1695
-
Sun 2024
ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():7417-7431 Association for Computational Linguistics (ACL) 2024 Ref ID: 4165 The integration of Large Language Models (LLMs) and knowledge graphs (KGs) has achieved remarkable success in various natural language processing tasks. However, existing methodologies that integrate LLMs and KGs often navigate the task-solving process solely based on the LLM's analysis of the question, overlooking the rich cognitive potential inherent in the vast knowledge encapsulated in KGs. To address this, we introduce Observation-Driven Agent (ODA), a novel AI agent framework tailored for tasks involving KGs. ODA incorporates KG reasoning abilities via global observation, which enhances reasoning capabilities through a cyclical paradigm of observation, action, and reflection. Confronting the exponential explosion of knowledge during observation, we innovatively design a recursive observation mechanism. Subsequently, we integrate the observed knowledge into the action and reflection modules. Through extensive experiments, ODA demonstrates state-of-the-art performance on several datasets, notably achieving accuracy improvements of 12.87% and 8.9%. Our code and data are available on https://github.com/lanjiuqing64/KGdata. © 2024 Association for Computational Linguistics. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3662
-
Sun 2024
LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing
arXiv 2024;(): 2024 Ref ID: 8351 Large language models (LLMs) have shown impressive capabilities in knowledge memorization. However, when it comes to domain-specific knowledge and downstream tasks such as medicine, general LLMs are often unable to give precise answers. In addition, when people want LLMs to answer classification questions, they usually go through instruction tuning first. However, LLMs do not always give a direct index of the categorization after instruction tuning. In this paper, we propose LlamaCare, a fine-tuned medical language model, and Extended Classification Integration (ECI), a module to handle classification problems of LLMs. Our contributions are: (i) We fine-tuned a large language model on medical knowledge with very low carbon emissions and achieved performance similar to ChatGPT using a 24 GB GPU. (ii) We solved the problem of redundant categorical answers and improved the performance of LLMs by proposing a new module called Extended Classification Integration. (iii) We released our processed data for one-shot and few-shot training for benchmarks such as PubMedQA and USMLE Steps 1-3. Our method achieves performance comparable to some state-of-the-art models with the same number of parameters on benchmarks, while being more environmentally friendly by using less GPU computation time. Our models, code, and datasets can be found at https://github.com/Stephen-SMJ/LLamaCare. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1121
-
Sun 2024
Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction
WWW 2024 - Proceedings of the ACM Web Conference 2024;():4407-4416 Association for Computing Machinery, Inc 2024 DOI: 10.1145/3589334.3645678 · Ref ID: 4030 Document-level Relation Triplet Extraction (DocRTE) is a fundamental task in information systems that aims to simultaneously extract entities with semantic relations from a document. Existing methods heavily rely on a substantial amount of fully labeled data. However, collecting and annotating data for newly emerging relations is time-consuming and labor-intensive. Recent advanced Large Language Models (LLMs), such as ChatGPT and LLaMA, exhibit impressive long-text generation capabilities, inspiring us to explore an alternative approach for obtaining auto-labeled documents with new relations. In this paper, we propose a Zero-shot Document-level Relation Triplet Extraction (ZeroDocRTE) framework, which Generates labeled data by Retrieval and Denoising Knowledge from LLMs, called GenRDK. Specifically, we propose a chain-of-retrieval prompt to guide ChatGPT to generate labeled long-text data step by step. To improve the quality of synthetic data, we propose a denoising strategy based on the consistency of cross-document knowledge. Leveraging our denoised synthetic data, we proceed to fine-tune the LLaMA2-13B-Chat for extracting document-level relation triplets. We perform experiments for both zero-shot document-level relation and triplet extraction on two public datasets. The experimental results illustrate that our GenRDK framework outperforms strong baselines. © 2024 ACM. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1841
-
Sun 2024
Root Cause Analysis for Industrial Process Anomalies through the Integration of Knowledge Graph and Large Language Model
Chinese Control Conference, CCC 2024;():6855-6860 IEEE Computer Society 2024 DOI: 10.23919/CCC63176.2024.10662704 · Ref ID: 4155 Root cause analysis for industrial process anomalies is critical for manufacturing activities. Industrial process alarms can provide crucial information to enable root cause analysis. However, the complex system structure causes a large number of alarms to emerge at the same time. To address this issue, we proposed an approach that utilizes knowledge graphs and large language models to provide comprehensible root cause analysis. Firstly, we extract knowledge such as historical anomalies from catalytic cracking operation manuals to construct an industrial process safety knowledge graph. Then, named entities in each alarm are extracted as keywords to retrieve factual knowledge from the knowledge graph. Finally, factual knowledge will be provided to the large language model as prior knowledge to infer the root cause of anomalies. Experimental results show that the proposed approach can accurately identify the root cause, thereby ensuring the safety of industrial processes. © 2024 Technical Committee on Control Theory, Chinese Association of Automation. |
yuexi
voted
Mike
voted
Final decision
What was the agreed final decision?
#1091
-
Sun 2020
CoLAKE: Contextualized Language and Knowledge Embedding
COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference 2020;():3660-3670 Association for Computational Linguistics (ACL) 2020 Ref ID: 5747 With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models. Few works explore the potential of deep contextualized knowledge representation when injecting knowledge. In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation for both language and knowledge with the extended MLM objective. Instead of injecting only entity embeddings, CoLAKE extracts the knowledge context of an entity from large-scale knowledge bases. To handle the heterogeneity of knowledge context and language context, we integrate them in a unified data structure, word-knowledge graph (WK graph). CoLAKE is pre-trained on large-scale WK graphs with the modified Transformer encoder. We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks. Experimental results show that CoLAKE outperforms previous counterparts on most of the tasks. Besides, CoLAKE achieves surprisingly high performance on our synthetic task called word-knowledge graph completion, which shows the superiority of simultaneously contextualizing language and knowledge representation. © 2020 COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1100
-
Sun 2023
Combining Structure Embedding and Text Semantics for Efficient Knowledge Graph Completion
Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE 2023;2023-July():317-322 Knowledge Systems Institute Graduate School 2023 DOI: 10.18293/SEKE2023-100 · Ref ID: 5291 Knowledge graph completion plays a crucial role in downstream applications. However, existing methods tend to only rely on the structure or textual information, resulting in suboptimal model performance. Moreover, recent attempts to leverage pre-trained language models to complete knowledge graphs have proved unsatisfactory. To overcome these limitations, we propose a novel model that combines structural embedding and semantic information of the knowledge graph. Compared with previous works based on pre-trained language models, our model can better use the implicit knowledge of pre-trained language models by using relation templates, entity definitions, and learnable tokens. Furthermore, our model employs a multi-head attention mechanism to transform the embedding semantic space of entities and relations obtained from the knowledge graph embedding model, thereby enhancing their expressiveness and unifying the semantic space of both types of information. Finally, we utilize convolutional neural networks to extract features from the matrices created by combining these two types of information for link prediction and triplet classification tasks. Empirical evaluations on two knowledge graph completion datasets demonstrate that our model is effective for both tasks. © 2023 Knowledge Systems Institute Graduate School. All rights reserved. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#3785
-
Sun 2024
Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models
arXiv 2024;(): 2024 Ref ID: 8415 The rapid advancement of Text-to-Image (T2I) generative models has enabled the synthesis of high-quality images guided by textual descriptions. Despite this significant progress, these models are often susceptible to generating content that contradicts the input text, which poses a challenge to their reliability and practical deployment. To address this problem, we introduce a novel diffusion-based framework to significantly enhance the alignment of generated images with their corresponding descriptions, addressing the inconsistency between visual output and textual input. Our framework is built upon a comprehensive analysis of inconsistency phenomena, categorizing them based on their manifestation in the image. Leveraging a state-of-the-art large language model, we first extract objects and construct a knowledge graph to predict the locations of these objects in potentially generated images. We then integrate a state-of-the-art controllable image generation model with a visual text generation module to generate an image that is consistent with the original prompt, guided by the predicted object locations. Through extensive experiments on an advanced multimodal hallucination benchmark, we demonstrate the efficacy of our approach in accurately generating images consistent with the original prompt. The code can be accessed via https://github.com/TruthAI-Lab/PCIG. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3671
-
Sun 2024
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
arXiv 2024;(): 2024 Ref ID: 8053 Large language models (LLMs) have demonstrated significant potential in various tasks, including vulnerability detection. However, current efforts in this area are preliminary, lacking clarity on whether LLMs' vulnerability reasoning capabilities stem from the models themselves or external aids such as knowledge retrieval and tooling support. This paper aims to isolate LLMs' vulnerability reasoning from other capabilities, such as vulnerability knowledge adoption, context information retrieval, and structured output generation. We introduce LLM4Vuln, a unified evaluation framework that separates and assesses LLMs' vulnerability reasoning capabilities and examines improvements when combined with other enhancements. We conducted controlled experiments with 97 ground-truth vulnerabilities and 97 non-vulnerable cases in Solidity and Java, testing them in a total of 9,312 scenarios across four LLMs (GPT-4, GPT-3.5, Mixtral, and Llama 3). Our findings reveal the varying impacts of knowledge enhancement, context supplementation, prompt schemes, and models. Additionally, we identified 14 zero-day vulnerabilities in four pilot bug bounty programs, resulting in $3,576 in bounties. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#2842
-
Sun 2022
Research and Application of Automatic Text Summarization Technology Based on Deep Learning
2022 11th International Conference of Information and Communication Technology (ICTech) 2022;():225-229 2022 DOI: 10.1109/ICTech55460.2022.00052 · Ref ID: 6391 It takes a lot of time and energy for users to obtain useful information from the massive data generated by the Internet. A text abstract is a refined expression of the content of an article, which can summarize its main content. Text summarization technology allows users to quickly obtain information that is valuable to them and, to a certain extent, alleviates the problem of information overload in the era of big data. In this paper, we use a knowledge-enhancement model to learn the semantic relationships of the real world by modeling entity concepts and other prior semantic knowledge in massive data, so as to overcome the disadvantage of previous language models that use only the original language signal. A generative pre-training model is then used to solve specific problems in natural language generation, such as the exposure bias problem. The experimental results show that the model used in this paper performs well on the Gigaword and CNN/DailyMail datasets. At the same time, the abstracts generated on the NLPCC2017 Chinese abstract data have good accuracy and readability. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3588
-
Sun 2024
Knowledge Graph in Astronomical Research with Large Language Models: Quantifying Driving Forces in Interdisciplinary Scientific Discovery
arXiv 2024;(): 2024 Ref ID: 8347 Identifying and predicting the factors that contribute to the success of interdisciplinary research is crucial for advancing scientific discovery. However, there is a lack of methods to quantify the integration of new ideas and technological advancements in astronomical research and how these new technologies drive further scientific breakthroughs. Large language models, with their ability to extract key concepts from vast literature beyond keyword searches, provide a new tool to quantify such processes. In this study, we extracted concepts in astronomical research from 297,807 publications between 1993 and 2024 using large language models, resulting in a set of 24,939 concepts. These concepts were then used to form a knowledge graph, where the link strength between any two concepts was determined by their relevance through the citation-reference relationships. By calculating this relevance across different time periods, we quantified the impact of numerical simulations and machine learning on astronomical research. The knowledge graph demonstrates two phases of development: a phase where the technology was integrated and another where the technology was explored in scientific discovery. The knowledge graph reveals that although machine learning has made significant inroads into astronomy, there is currently a lack of new concept development at the intersection of AI and astronomy, which may be the current bottleneck preventing machine learning from further transforming the field of astronomy. |
Mike
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3622
-
Suri 2023
Language Models sounds the Death Knell of Knowledge Graphs
arXiv 2023;(): 2023 Ref ID: 7633 Healthcare domain generates a lot of unstructured and semi-structured text. Natural Language processing (NLP) has been used extensively to process this data. Deep Learning based NLP especially Large Language Models (LLMs) such as BERT have found broad acceptance and are used extensively for many applications. A Language Model is a probability distribution over a word sequence. Self-supervised Learning on a large corpus of data automatically generates deep learning-based language models. BioBERT and Med-BERT are language models pre-trained for the healthcare domain. Healthcare uses typical NLP tasks such as question answering, information extraction, named entity recognition, and search to simplify and improve processes. However, to ensure robust application of the results, NLP practitioners need to normalize and standardize them. One of the main ways of achieving normalization and standardization is the use of Knowledge Graphs. A Knowledge Graph captures concepts and their relationships for a specific domain, but their creation is time-consuming and requires manual intervention from domain experts, which can prove expensive. SNOMED CT (Systematized Nomenclature of Medicine – Clinical Terms), Unified Medical Language System (UMLS), and Gene Ontology (GO) are popular ontologies from the healthcare domain. SNOMED CT and UMLS capture concepts such as disease, symptoms and diagnosis and GO is the world's largest source of information on the functions of genes. Healthcare has been dealing with an explosion in information about different types of drugs, diseases, and procedures. This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain. We present experiments using LLMs for the healthcare domain to demonstrate that language models provide the same functionality as knowledge graphs, thereby making knowledge graphs redundant. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3593
-
Susanti 2024
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery
arXiv 2024;(): 2024 Ref ID: 8487 Causal discovery aims to estimate causal structures among variables based on observational data. Large Language Models (LLMs) offer a fresh perspective to tackle the causal discovery problem by reasoning on the metadata associated with variables rather than their actual data values, an approach referred to as knowledge-based causal discovery. In this paper, we investigate the capabilities of Small Language Models (SLMs, defined as LLMs with fewer than 1 billion parameters) with prompt-based learning for knowledge-based causal discovery. Specifically, we present KG Structure as Prompt, a novel approach for integrating structural information from a knowledge graph, such as common neighbor nodes and metapaths, into prompt-based learning to enhance the capabilities of SLMs. Experimental results on three types of biomedical and open-domain datasets under few-shot settings demonstrate the effectiveness of our approach, surpassing most baselines and even conventional fine-tuning approaches trained on full datasets. Our findings further highlight the strong capabilities of SLMs: in combination with knowledge graphs and prompt-based learning, SLMs demonstrate the potential to surpass LLMs with larger number of parameters. Our code and datasets are available on GitHub. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3311
-
Takahashi 2024
The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models
arXiv 2024;(): 2024 Ref ID: 8367 Language models (LMs) encode world knowledge in their internal parameters through training. However, LMs may learn personal and confidential information from the training data, leading to privacy concerns such as data leakage. Therefore, research on knowledge deletion from LMs is essential. This study focuses on the knowledge stored in LMs and analyzes the relationship between the side effects of knowledge deletion and the entities related to the knowledge. Our findings reveal that deleting knowledge related to popular entities can have catastrophic side effects. Furthermore, this research is the first to analyze knowledge deletion in models trained on synthetic knowledge graphs, indicating a new direction for controlled experiments. |
Reviewers: Mike (voted), Srividya (voted). Final decision: (not recorded).
#3519 - Talukdar 2024
Improving Large Language Model (LLM) fidelity through context-aware grounding: A systematic approach to reliability and veracity
arXiv 2024;(): 2024 Ref ID: 8517 As Large Language Models (LLMs) become increasingly sophisticated and ubiquitous in natural language processing (NLP) applications, ensuring their robustness, trustworthiness, and alignment with human values has become a critical challenge. This paper presents a novel framework for contextual grounding in textual models, with a particular emphasis on the Context Representation stage. Our approach aims to enhance the reliability and ethical alignment of these models through a comprehensive, context-aware methodology. By explicitly capturing and representing relevant situational, cultural, and ethical contexts in a machine-readable format, we lay the foundation for anchoring a model's behavior within these contexts. Our approach leverages techniques from knowledge representation and reasoning, such as ontologies, semantic web technologies, and logic-based formalisms. We evaluate our framework on real-world textual datasets, demonstrating its effectiveness in improving model performance, fairness, and alignment with human expectations, while maintaining high accuracy. Furthermore, we discuss the other key components of the framework, including context-aware encoding, context-aware learning, interpretability and explainability, and continuous monitoring and adaptation. This research contributes to the growing body of work on responsible AI, offering a practical approach to developing more reliable, trustworthy, and ethically-aligned language models. Our findings have significant implications for the deployment of LLMs in sensitive domains such as healthcare, legal systems, and social services, where contextual understanding is paramount. |
Reviewers: Davis (voted), mohammed afaan (voted). Final decision: (not recorded).
#811 - Tan 2022
TEGTOK: Augmenting Text Generation via Task-specific and Open-world Knowledge
60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():1597-1609 Dublin, IRELAND Assoc Computational Linguistics-Acl 2022 Ref ID: 3395 Generating natural and informative texts has been a long-standing problem in NLP. Much effort has been dedicated into incorporating pre-trained language models (PLMs) with various open-world knowledge, such as knowledge graphs or wiki pages. However, their ability to access and manipulate the task-specific knowledge is still limited on downstream tasks, as this type of knowledge is usually not well covered in PLMs and is hard to acquire. To address the problem, we propose augmenting TExt Generation via Task-specific and Open-world Knowledge ( TEGTOK) in a unified framework. Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively on the basis of PLMs. With the help of these two types of knowledge, our model can learn what and how to generate. Experiments on two text generation tasks of dialogue generation and question generation, and on two datasets show that our method achieves better performance than various baseline models. |
Reviewers: Davis (voted), Mike (voted). Final decision: (not recorded).
#1887 - Tan 2024
Small Models, Big Insights: Leveraging Slim Proxy Models to Decide When and What to Retrieve for LLMs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():4420-4436 Association for Computational Linguistics (ACL) 2024 Ref ID: 4348 The integration of large language models (LLMs) and search engines represents a significant evolution in knowledge acquisition methodologies. However, determining the knowledge that an LLM already possesses and the knowledge that requires the help of a search engine remains an unresolved issue. Most existing methods solve this problem through the results of preliminary answers or reasoning done by the LLM itself, but this incurs excessively high computational costs. This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in LLMs with a slim proxy model, to enhance the LLM's knowledge acquisition process. We employ a proxy model which has far fewer parameters, and take its answers as heuristic answers. Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM. We only conduct retrieval for the missing knowledge in questions that the LLM does not know. Extensive experimental results on five datasets with two LLMs demonstrate a notable improvement in the end-to-end performance of LLMs in question-answering tasks, achieving or surpassing current state-of-the-art models with lower LLM inference costs. © 2024 Association for Computational Linguistics. |
Reviewers: yuexi (voted), Mike (voted). Final decision: (not recorded).
#363 - Tan 2023
Incorporating entity-level knowledge in pretrained language model for biomedical dense retrieval
In recent years, pre-trained language models (PLMs) have dominated natural language processing (NLP) and achieved outstanding performance in various NLP tasks, including dense retrieval based on PLMs. However, in the biomedical domain, the effectiveness of dense retrieval models based on PLMs still needs to be improved due to the diversity and ambiguity of entity expressions caused by the enrichment of biomedical entities. To alleviate the semantic gap, in this paper, we propose a method that incorporates external knowledge at the entity level into a dense retrieval model to enrich the dense representations of queries and documents. Specifically, we first add additional self-attention and information interaction modules in the Transformer layer of the BERT architecture to perform fusion and interaction between query/document text and entity embeddings from knowledge graphs. We then propose an entity similarity loss to constrain the model to better learn external knowledge from entity embeddings, and further propose a weighted entity concatenation mechanism to balance the impact of entity representations when matching queries and documents. Experiments on two publicly available biomedical retrieval datasets show that our proposed method outperforms state-of-the-art dense retrieval methods. In terms of NDCG metrics, the proposed method (called ELK) improves the ranking performance of coCondenser by at least 5% on both datasets, and also obtains further performance gain over state-of-the-art EVA methods. Though having a more sophisticated architecture, the average query latency of ELK is still within the same order of magnitude as that of other efficient methods.
Reviewers: Davis (voted), mohammed afaan (voted). Final decision: (not recorded).
#3770 - Tan 2019
Positional Attention-based Frame Identification with BERT: A Deep Learning Approach to Target Disambiguation and Semantic Frame Selection
arXiv 2019;(): 2019 Ref ID: 7380 Semantic parsing is the task of transforming sentences from natural language into formal representations of predicate-argument structures. Under this research area, frame-semantic parsing has attracted much interest. This parsing approach leverages the lexical information defined in FrameNet to associate marked predicates or targets with semantic frames, thereby assigning semantic roles to sentence components based on pre-specified frame elements in FrameNet. In this paper, a deep neural network architecture known as Positional Attention-based Frame Identification with BERT (PAFIBERT) is presented as a solution to the frame identification subtask in frame-semantic parsing. Although the importance of this subtask is well-established, prior research has yet to find a robust solution that works satisfactorily for both in-domain and out-of-domain data. This study thus set out to improve frame identification in light of recent advancements of language modeling and transfer learning in natural language processing. The proposed method is partially empowered by BERT, a pre-trained language model that excels at capturing contextual information in texts. By combining the language representation power of BERT with a position-based attention mechanism, PAFIBERT is able to attend to target-specific contexts in sentences for disambiguating targets and associating them with the most suitable semantic frames. Under various experimental settings, PAFIBERT outperformed existing solutions by a significant margin, achieving new state-of-the-art results for both in-domain and out-of-domain benchmark test sets. |
Reviewers: Mike (voted), Xinchen (voted). Final decision: (not recorded).
#1540 - Tanaka 2024
KnowledgeHub: An End-to-End Tool for Assisted Scientific Discovery
IJCAI International Joint Conference on Artificial Intelligence 2024;():8815-8819 International Joint Conferences on Artificial Intelligence 2024 Ref ID: 4378 This paper describes the KnowledgeHub tool, a scientific literature Information Extraction (IE) and Question Answering (QA) pipeline. This is achieved by supporting the ingestion of PDF documents that are converted to text and structured representations. An ontology can then be constructed where a user defines the types of entities and relationships they want to capture. A browser-based annotation tool enables annotating the contents of the PDF documents according to the ontology. Named Entity Recognition (NER) and Relation Classification (RC) models can be trained on the resulting annotations and can be used to annotate the unannotated portion of the documents. A knowledge graph is constructed from these entity and relation triples which can be queried to obtain insights from the data. Furthermore, we integrate a suite of Large Language Models (LLMs) that can be used for QA and summarisation that is grounded in the included documents via a retrieval component. KnowledgeHub is a unique tool that supports annotation, IE and QA, which gives the user full insight into the knowledge discovery pipeline. © 2024 International Joint Conferences on Artificial Intelligence. All rights reserved. |
Reviewers: Srividya (voted), Ishan (voted). Final decision: (not recorded).
#756 - Tang 2024
Semantic-aware entity alignment for low resource language knowledge graph
Entity alignment (EA) is an important technique aiming to find the same real entity between two different source knowledge graphs (KGs). Current methods typically learn entity embeddings for EA from the structure of the KGs. Most EA models are designed for rich-resource languages, requiring sufficient resources such as a parallel corpus and pre-trained language models. However, low-resource language KGs have received less attention, and current models demonstrate poor performance on those low-resource KGs. Recently, researchers have fused relation information and attributes for entity representations to enhance the entity alignment performance, but the relation semantics are often ignored. To address these issues, we propose a novel Semantic-aware Graph Neural Network (SGNN) for entity alignment. First, we generate pseudo sentences according to the relation triples and produce representations using pre-trained models. Second, our approach explores semantic information from the connected relations by a graph neural network. Our model captures expanded feature information from KGs. Experimental results using three low-resource languages demonstrate that our proposed SGNN approach outperforms state-of-the-art alignment methods on three proposed datasets and three public datasets.
Reviewers: Srividya (voted), Xinchen (voted). Final decision: (not recorded).
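The pseudo-sentence step described in the SGNN abstract above can be sketched minimally: each relation triple is verbalised into a short sentence that a pre-trained encoder could then embed. The triples and verbalisation template here are our own illustration, not the paper's data.

```python
def verbalise(triple):
    """Turn a (head, relation, tail) triple into a pseudo sentence."""
    head, relation, tail = triple
    # Underscored relation labels become plain words.
    return f"{head} {relation.replace('_', ' ')} {tail}."

# Hypothetical triples for illustration.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "European Union"),
]

pseudo_sentences = [verbalise(t) for t in triples]
# pseudo_sentences[0] == "Paris capital of France."
```

In the paper these sentences feed a pre-trained model to produce entity representations; the exact template is not given in the abstract.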
#3483 - Tang 2024
GraphArena: Benchmarking Large Language Models on Graph Computational Problems
arXiv 2024;(): 2024 Ref ID: 8434 The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully examine their progresses. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of 10 computational tasks, encompassing four polynomial-time (e.g., Shortest Distance) and six NP-complete challenges (e.g., Travelling Salesman Problem). It features a rigorous evaluation framework that classifies LLM outputs as correct, suboptimal (feasible but not optimal), or hallucinatory (properly formatted but infeasible). Evaluation of 10 leading LLMs, including GPT-4o and LLaMA3-70B-Instruct, reveals that even top-performing models struggle with larger, more complex graph problems and exhibit hallucination issues. Despite the application of strategies such as chain-of-thought prompting, these issues remain unresolved. GraphArena contributes a valuable supplement to the existing LLM benchmarks and is open-sourced at https://github.com/squareRoot3/GraphArena. |
Reviewers: Kwesi (voted), mohammed afaan (voted). Final decision: (not recorded).
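GraphArena's three-way verdicts (correct, suboptimal, hallucinatory) described in the record above can be illustrated on a toy Travelling Salesman instance. The grading logic and distance matrix below are our own reconstruction from the abstract, not the benchmark's code.

```python
from itertools import permutations

# Symmetric distance matrix for 4 cities (toy data).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 8],
    [10, 4, 8, 0],
]

def tour_length(tour):
    """Total length of a closed tour visiting cities in order."""
    return sum(DIST[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def optimal_length(n):
    # Brute force is fine at this scale; fix city 0 as the start.
    return min(tour_length((0,) + p) for p in permutations(range(1, n)))

def grade(tour, n=4):
    """Classify a model-proposed tour as GraphArena's paper describes."""
    if sorted(tour) != list(range(n)):  # properly formatted but infeasible
        return "hallucinatory"
    best = optimal_length(n)
    return "correct" if tour_length(tour) == best else "suboptimal"
```

The same pattern (feasibility check first, then comparison against a computed optimum) would apply to the other polynomial-time and NP-complete tasks, though at the benchmark's million-node scale exact optima are presumably replaced by known or best-found solutions.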
#3263 - Tao 2024
Clue-Guided Path Exploration: Optimizing Knowledge Graph Retrieval with Large Language Models to Address the Information Black Box Challenge
arXiv 2024;(): 2024 Ref ID: 8045 In recent times, large language models (LLMs) have showcased remarkable capabilities. However, updating their knowledge poses challenges, potentially leading to inaccuracies when confronted with unfamiliar queries. To address this issue, integrating external knowledge bases such as knowledge graphs with large language models is a viable approach. The key challenge lies in extracting the required knowledge from knowledge graphs based on natural language, demanding high semantic understanding. Therefore, researchers are considering leveraging large language models directly for knowledge retrieval from these graphs. Current efforts typically rely on the comprehensive problem-solving capabilities of large language models. We argue that a problem we term the 'information black box' can significantly impact the practical effectiveness of such methods. Moreover, this kind of methods is less effective for scenarios where the questions are unfamiliar to the large language models. In this paper, we propose a Clue-Guided Path Exploration (CGPE) framework to optimize knowledge retrieval based on large language models. By addressing the 'information black box' issue and employing single-task approaches instead of complex tasks, we have enhanced the accuracy and efficiency of using large language models for retrieving knowledge graphs. Experiments on open-source datasets reveal that CGPE outperforms previous methods and is highly applicable to LLMs with fewer parameters. In some instances, even ChatGLM3, with its 6 billion parameters, can rival the performance of GPT-4. Furthermore, the results indicate a minimal invocation frequency of CGPE on LLMs, suggesting reduced computational overhead. For organizations and individuals facing constraints in computational resources, our research offers significant practical value. |
Reviewers: Kwesi (voted), mohammed afaan (voted). Final decision: (not recorded).
#318 - Taunk 2023
GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering
32nd World Wide Web Conference (WWW) 2023;():1138-1144 Austin, TX Assoc Computing Machinery 2023 DOI: 10.1145/3543873.3587651 · Ref ID: 3409 Commonsense question-answering (QA) methods combine the power of pre-trained Language Models (LM) with the reasoning provided by Knowledge Graphs (KG). A typical approach collects nodes relevant to the QA pair from a KG to form a Working Graph (WG) followed by reasoning using Graph Neural Networks (GNNs). This faces two major challenges: (i) it is difficult to capture all the information from the QA in the WG, and (ii) the WG contains some irrelevant nodes from the KG. To address these, we propose GrapeQA with two simple improvements on the WG: (i) Prominent Entities for Graph Augmentation identifies relevant text chunks from the QA pair and augments the WG with corresponding latent representations from the LM, and (ii) Context-Aware Node Pruning removes nodes that are less relevant to the QA pair. We evaluate our results on OpenBookQA, CommonsenseQA and MedQA-USMLE and see that GrapeQA shows consistent improvements over its LM + KG predecessor (QA-GNN in particular) and large improvements on OpenBookQA. |
Reviewers: Ishan (voted), Srividya (voted). Final decision: (not recorded).
#3598 - Teneva 2023
Knowledge Graphs are not Created Equal: Exploring the Properties and Structure of Real KGs
arXiv 2023;(): 2023 Ref ID: 7932 Despite the recent popularity of knowledge graph (KG) related tasks and benchmarks such as KG embeddings, link prediction, entity alignment and evaluation of the reasoning abilities of pretrained language models as KGs, the structure and properties of real KGs are not well studied. In this paper, we perform a large scale comparative study of 29 real KG datasets from diverse domains such as the natural sciences, medicine, and NLP to analyze their properties and structural patterns. Based on our findings, we make several recommendations regarding KG-based model development and evaluation. We believe that the rich structural information contained in KGs can benefit the development of better KG models across fields and we hope this study will contribute to breaking the existing data silos between different areas of research (e.g., ML, NLP, AI for sciences). |
Reviewers: Kwesi (voted), mohammed afaan (voted). Final decision: (not recorded).
#245 - Terron 2023
Event Extraction and Semantic Representation from Spanish Workers' Statute Using Large Language Models
36th Annual International Conference on Legal Knowledge and Information Systems (JURIX) 2023;379():329-334 Maastricht Univ, Maastricht, NETHERLANDS Ios Press 2023 DOI: 10.3233/faia230983 · Ref ID: 3169 This work uses Large Language Models to process an important piece of Spanish legislation: the Workers' Statute. The proposed method extracts the relevant events in its articles using a GPT-3.5 model and represents the entities involved in the events and the relationships between them as RDF triples. The experiments carried out to select a high-performance strategy include both zero- and few-shot learning tests. Finally, this work proposes a strategy to uplift the extracted legal relations into a legal knowledge graph. |
Reviewers: mohammed afaan (voted), yuexi (voted). Final decision: (not recorded).
#775 - Thai 2021
Simultaneously Self-Attending to Text and Entities for Knowledge-Informed Text Representations
Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():241-247 Electr Network Assoc Computational Linguistics-Acl 2021 Ref ID: 3407 Pre-trained language models have emerged as highly successful methods for learning good text representations. However, the amount of structured knowledge retained in such models, and how (if at all) it can be extracted, remains an open question. In this work, we aim at directly learning text representations which leverage structured knowledge about entities mentioned in the text. This can be particularly beneficial for downstream tasks which are knowledge-intensive. Our approach utilizes self-attention between words in the text and knowledge graph (KG) entities mentioned in the text. While existing methods require entity-linked data for pre-training, we train using a mention-span masking objective and a candidate ranking objective - which doesn't require any entity-links and only assumes access to an alias table for retrieving candidates, enabling large-scale pre-training. We show that the proposed model learns knowledge-informed text representations that yield improvements on the downstream tasks over existing methods. |
Reviewers: Srividya (voted), Xinchen (voted). Final decision: (not recorded).
#2570 - Thakur 2012
Information extraction from semi-structured and un-structured documents using probabilistic context free grammar inference
2012 International Conference on Information Retrieval & Knowledge Management 2012;():273-276 2012 DOI: 10.1109/InfRKM.2012.6204988 · Ref ID: 6240 Large number of research papers are available in the form of un-structured (text) format. Knowledge discovery in un-structured document has been recognized as promising task. These documents are typically formatted for human viewing, which varies widely from document to document. Frequent change in their formatting causes difficulties in constructing a global schema. Thus, discovery of interesting rules from it is a complex and tedious process. Recently, conditional random fields (CRFs) and hand-coded wrappers have been used to label the text (such as Title, Author Name(s), Affiliation, Email, Contact number, etc. in research papers). In this paper we propose a novel hybrid approach to infer grammar rules using alignment similarity and probabilistic context free grammar. It helps in extracting desired information from the document. |
Reviewers: Mike (voted), brandon (voted). Final decision: (not recorded).
#1020 - Thant 2023
BERT Fine-Tuning the Covid-19 Open Research Dataset for Named Entity Recognition
Communications in Computer and Information Science 2023;1942 CCIS():261-275 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-981-99-7969-1_19 · Ref ID: 5060 This study employs the widely used Large Language Model (LLM), BERT, to implement Named Entity Recognition (NER) on the CORD-19 biomedical literature corpus. By fine-tuning the pre-trained BERT on the CORD-NER dataset, the model gains the ability to comprehend the context and semantics of biomedical named entities. The refined model is then utilized on the CORD-19 to extract more contextually relevant and updated named entities. However, fine-tuning large datasets with LLMs poses a challenge. To counter this, two distinct sampling methodologies are proposed to apply on each dataset. First, for the NER task on the CORD-19, a Latent Dirichlet Allocation (LDA) topic modeling technique is employed. This maintains the sentence structure while concentrating on related content. Second, a straightforward greedy method is deployed to gather the most informative data of 25 entity types from the CORD-NER dataset. The study realizes its goals by demonstrating the content comprehension capability of BERT-based models without the necessity of supercomputers, and converting the document-level corpus into a source for NER data, enhancing data accessibility. The outcomes of this research can shed light on the potential progression of more sophisticated NLP applications across various sectors, including knowledge graph creation, ontology learning, and conversational AI. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd 2023. |
Reviewers: brandon (voted), Kwesi (voted). Final decision: (not recorded).
#1748 - Thießen 2023
Probing Large Language Models for Scientific Synonyms
CEUR Workshop Proceedings 2023;3510(): CEUR-WS 2023 Ref ID: 5198 Purpose: Automatically identifying synonyms is an important but challenging aspect of entity normalization in knowledge graphs. Entity normalization is crucial in ensuring that information in knowledge graphs is well connected and therefore efficiently reusable. We aim to investigate the potential of pre-trained large language models (LLMs) for this task. Methodology: We use k-Means clustering to compare latent concepts learned by LLMs with human-defined scientific synonymy concept clusters sourced from ORKG, CS-KG, SemEval 2017, and SciERC data. We investigate the models BERT, RoBERTa, BART, and OpenAI GPT3 (text-embedding-ada-002 variant) and evaluate clustering results by model layer. Findings: F1 scores average around 0.7 to 0.75 depending on the dataset and layer. The best results are reached using OpenAI GPT3 (max F1=0.914). We further notice no advantage of models trained on scientific data. Value: Our results suggest information learned by transformer models aligns with human-defined scientific synonyms. This shows the potential of information encoded in pre-trained LLMs to be leveraged for synonymy detection. © 2023 Copyright for this paper by its authors. |
Reviewers: Xinchen (voted), Srividya (voted). Final decision: (not recorded).
#1468 - Tian 2024
KG-Adapter: Enabling Knowledge Graph Integration in Large Language Models through Parameter-Efficient Fine-Tuning
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():3813-3828 Association for Computational Linguistics (ACL) 2024 Ref ID: 4259 Although large language models (LLMs) show remarkable capabilities and generalizability across various tasks, they are criticized for lack of expertise. One promising solution is to combine knowledge graphs (KGs) with LLMs, and recent studies focus on integrating KGs into LLMs through prompt-based methods. However, these approaches fail to use the structural information of the KGs, suffer from the problem of knowledge conflict, and over-reliance on super LLMs. To address these challenges, we propose KG-Adapter, a parameter-level KG integration method based on parameter-efficient fine-tuning (PEFT). Specifically, we introduce a novel adapter structure designed for decoder-only LLMs, which can encode KGs from both node-centered and relation-centered perspectives, and then perform joint reasoning with LLMs to generate responses end-to-end. Experiments with diverse models on four datasets for two different tasks all demonstrate significant improvements. With only 28M parameters trained, we make the 7B-parameter LLM outperform the previous full-parameter fine-tuned state-of-the-art method and comparable to the prompt-based ChatGPT methods. © 2024 Association for Computational Linguistics. |
Reviewers: Kwesi (voted), Xinchen (voted). Final decision: (not recorded).
#648 - Tian 2024
PDEC: A Framework for Improving Knowledge Graph Reasoning Performance through Predicate Decomposition
The judicious configuration of predicates is a crucial but often overlooked aspect in the field of knowledge graphs. While previous research has primarily focused on the precision of triples in assessing knowledge graph quality, the rationality of predicates has been largely ignored. This paper introduces an innovative approach aimed at enhancing knowledge graph reasoning by addressing the issue of predicate polysemy. Predicate polysemy refers to instances where a predicate possesses multiple meanings, introducing ambiguity into the knowledge graph. We present an adaptable optimization framework that effectively addresses predicate polysemy, thereby enhancing reasoning capabilities within knowledge graphs. Our approach serves as a versatile and generalized framework applicable to any reasoning model, offering a scalable and flexible solution to enhance performance across various domains and applications. Through rigorous experimental evaluations, we demonstrate the effectiveness and adaptability of our methodology, showing significant improvements in knowledge graph reasoning accuracy. Our findings underscore that discerning predicate polysemy is a crucial step towards achieving a more dependable and efficient knowledge graph reasoning process. Even in the age of large language models, the optimization and induction of predicates remain relevant in ensuring interpretable reasoning. |
Reviewers: mohammed afaan (voted), yuexi (voted). Final decision: (not recorded).
#2598 - Tiwari 2003
Invisible formal methods for embedded control systems
Embedded control systems typically comprise continuous control laws combined with discrete mode logic. These systems are modeled using a hybrid automaton formalism, which is obtained by combining the discrete transition system formalism with continuous dynamical systems. This paper develops automated analysis techniques for asserting correctness of hybrid system designs. Our approach is based on symbolic representation of the state space of the system using mathematical formulas in an appropriate logic. Such formulas are manipulated using symbolic theorem proving techniques. It is important that formal analysis should be unobtrusive and acceptable to engineering practice. We motivate a methodology called invisible formal methods that provides a graded sequence of formal analysis technologies ranging from extended typechecking, through approximation and abstraction, to model checking and theorem proving. As an instance of invisible formal methods, we describe techniques to check inductive invariants, or extended types, for hybrid systems and compute discrete finite state abstractions automatically to perform reachability set computation. The abstract system is sound with respect to the formal semantics of hybrid automata. We also discuss techniques for performing analysis on nonstandard semantics of hybrid automata. We also briefly discuss the problem of translating models in Simulink/Stateflow language, which is widely used in practice, into the modeling formalisms, like hybrid automata, for which analysis tools are being developed. |
Reviewers: mohammed afaan (voted), yuexi (voted). Final decision: (not recorded).
#2890 - Todoran 2015
Semantic investigation of a control-flow subset of BPMN 2.0
2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP) 2015;():483-490 2015 DOI: 10.1109/ICCP.2015.7312707 · Ref ID: 6876 Business Process Model and Notation (BPMN), now at version 2.0.2, provides a standard graphical representation for specifying business processes. In this paper we report on the first stage of a semantic investigation of BPMN, using methods in the tradition of programming languages semantics. We consider a control-flow subset of BPMN and an execution architecture based on an intermediate language that we name ℒBPMN. The execution architecture comprises two main components: a translator which takes as input a BPMN model and generates ℒBPMN code, and an interpreter for ℒBPMN. ℒBPMN is a process oriented imperative language providing a combination of concepts, including maximal parallelism and durational activities. We employ the mathematical methodology of metric semantics in designing and relating an operational semantics O and a denotational semantics D for ℒBPMN. We establish the formal relation between O and D by using an abstraction operator and a fixed point argument. In this way we prove the correctness of the denotational semantics with respect to the operational semantics. We focus on the semantic investigation of BPMN. We also explain how the operational semantics can serve as a blueprint for an implementation on a client-server architecture. |
Reviewers: mohammed afaan (voted), yuexi (voted). Final decision: (not recorded).
#370 - Toledo 2019
Information extraction from historical handwritten document images with a context-aware neural model
Many historical manuscripts that hold trustworthy memories of the past societies contain information organized in a structured layout (e.g. census, birth or marriage records). The precious information stored in these documents cannot be effectively used nor accessed without costly annotation efforts. The transcription driven by the semantic categories of words is crucial for the subsequent access. In this paper we describe an approach to extract information from structured historical handwritten text images and build a knowledge representation for the extraction of meaning out of historical data. The method extracts information, such as named entities, without the need of an intermediate transcription step, thanks to the incorporation of context information through language models. Our system has two variants, the first one is based on bigrams, whereas the second one is based on recurrent neural networks. Concretely, our second architecture integrates a Convolutional Neural Network to model visual information from word images together with a Bidirectional Long Short Term Memory network to model the relation among the words. This integrated sequential approach is able to extract more information than just the semantic category (e.g. a semantic category can be associated to a person in a record). Our system is generic, it deals with out-of-vocabulary words by design, and it can be applied to structured handwritten texts from different domains. The method has been validated with the ICDAR IEHHR competition protocol, outperforming the existing approaches. (C) 2018 Elsevier Ltd. All rights reserved.
Votes: mohammed afaan, yuexi
#53
-
Tong 2024
Automating psychological hypothesis generation with AI: when large language models meet causal graph
Leveraging the synergy between causal knowledge graphs and a large language model (LLM), our study introduces a groundbreaking approach for computational hypothesis generation in psychology. We analyzed 43,312 psychology articles using an LLM to extract causal relation pairs. This analysis produced a specialized causal graph for psychology. Applying link prediction algorithms, we generated 130 potential psychological hypotheses focusing on "well-being", then compared them against research ideas conceived by doctoral scholars and those produced solely by the LLM. Interestingly, our combined approach of an LLM and causal graphs mirrored the expert-level insights in terms of novelty, clearly surpassing the LLM-only hypotheses (t(59) = 3.34, p = 0.007 and t(59) = 4.32, p < 0.001, respectively). This alignment was further corroborated using deep semantic analysis. Our results show that combining LLMs with machine learning techniques such as causal knowledge graphs can revolutionize automated discovery in psychology, extracting novel insights from the extensive literature. This work stands at the crossroads of psychology and artificial intelligence, championing a new enriched paradigm for data-driven hypothesis generation in psychological research. |
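The link-prediction step in this abstract can be sketched with one of the simplest signals, a common-neighbor score over a small causal graph. The graph, node names, and scoring heuristic below are illustrative assumptions; the paper's actual graph and algorithms are not reproduced here.

```python
# Toy directed causal graph: node -> set of nodes it causally influences.
causal_graph = {
    "stress": {"sleep_quality", "well_being"},
    "exercise": {"sleep_quality", "mood"},
    "mood": {"well_being"},
    "sleep_quality": {"well_being", "mood"},
}

def neighbors(g, n):
    """Undirected neighborhood: nodes linked to or from n."""
    out = set(g.get(n, set()))
    out |= {src for src, dsts in g.items() if n in dsts}
    return out

def common_neighbor_score(g, a, b):
    """Count shared neighbors; a higher score suggests a plausible
    missing link (i.e., a candidate hypothesis) between a and b."""
    return len(neighbors(g, a) & neighbors(g, b))

print(common_neighbor_score(causal_graph, "stress", "exercise"))
```

Ranking all unconnected node pairs by such a score, then passing the top pairs to an LLM for phrasing, gives the flavor of the hypothesis-generation pipeline the abstract outlines.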
Votes: yuexi, mohammed afaan
#3969
-
Tong 2024
Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study
arXiv 2024;(): 2024 Ref ID: 8559 Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which the NER performance of LLMs can be assessed. We then performed NER tasks for the 6 entity types using ChatGPT (GPT-3.5 and GPT-4) and 4 state-of-the-art BERT-based question-answering (QA) models (RoBERTa, MiniLM, PubMedBERT and SciBERT) without prior training on the specific task. A domain fine-tuned model (GSAP-NER) was also applied for a comprehensive comparison. Results: The overall performance of LLMs varied significantly in exact match and fuzzy match. In the fuzzy match, ChatGPT surpassed BERT-based QA models in 5 out of 6 tasks, while in exact match, BERT-based QA models outperformed ChatGPT in 5 out of 6 tasks but with a smaller F-1 difference. GPT-4 showed a significant advantage over other models in fuzzy match, especially on the entity types of TCM formula and Chinese patent drug (TFD) and ingredient (IG). Although GPT-4 outperformed BERT-based models on the entity types of herb, target, and research method, none of the F-1 scores exceeded 0.5. GSAP-NER outperformed GPT-4 in terms of F-1 by a slight margin on RM. ChatGPT achieved considerably higher recalls than precisions, particularly in the fuzzy match. Conclusions: The NER performance of LLMs is highly dependent on the entity type, and their performance varies across application scenarios. ChatGPT could be a good choice for scenarios where high recall is favored. However, for knowledge acquisition in rigorous scenarios, neither ChatGPT nor BERT-based QA models are off-the-shelf tools for professional practitioners. |
Votes: brandon, Kwesi
#2060
-
Toro 2024
Dynamic Retrieval Augmented Generation of Ontologies using Artificial Intelligence (DRAGON-AI)
BACKGROUND: Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources and extensive collaboration between domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). DRAGON-AI can generate textual and logical ontology components, drawing from existing knowledge in multiple ontologies and unstructured text sources. RESULTS: We assessed the performance of DRAGON-AI on de novo term construction across ten diverse ontologies, making use of extensive manual evaluation of results. Our method has high precision for relationship generation, though slightly lower precision than logic-based reasoning. Our method is also able to generate definitions deemed acceptable by expert evaluators, but these scored worse than human-authored definitions. Notably, evaluators with the highest level of confidence in a domain were better able to discern flaws in AI-generated definitions. We also demonstrated the ability of DRAGON-AI to incorporate natural language instructions in the form of GitHub issues. CONCLUSIONS: These findings suggest DRAGON-AI's potential to substantially aid the manual ontology construction process. However, our results also underscore the importance of having expert curators and ontology editors drive the ontology generation process. |
Votes: Ishan, brandon
#3381
-
Trajanoska 2023
Enhancing Knowledge Graph Construction Using Large Language Models
arXiv 2023;(): 2023 Ref ID: 7694 The growing trend of Large Language Model (LLM) development has attracted significant attention, with models for various applications emerging consistently. However, the combined application of Large Language Models with semantic technologies for reasoning and inference is still a challenging task. This paper analyzes how the current advances in foundational LLMs, like ChatGPT, compare with specialized pretrained models, like REBEL, for joint entity and relation extraction. To evaluate this approach, we conducted several experiments using sustainability-related text as our use case. We created pipelines for the automatic creation of Knowledge Graphs from raw texts, and our findings indicate that using advanced LLM models can improve the accuracy of the process of creating these graphs from unstructured text. Furthermore, we explored the potential of automatic ontology creation using foundation LLMs, which resulted in even more relevant and accurate knowledge graphs. |
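The last stage of the pipeline this abstract describes, assembling extracted (subject, relation, object) triples into a knowledge graph, can be sketched in a few lines. The triples below are invented examples in the sustainability domain, not output from the paper's pipeline.

```python
# Hypothetical triples, as an entity/relation extractor (LLM or REBEL-style
# model) might emit them from raw text.
triples = [
    ("solar power", "reduces", "emissions"),
    ("recycling", "supports", "sustainability"),
    ("solar power", "requires", "photovoltaic cells"),
]

# Build a tiny knowledge graph: subject -> list of (relation, object) edges.
kg = {}
for subj, rel, obj in triples:
    kg.setdefault(subj, []).append((rel, obj))

print(kg["solar power"])
```

The real systems differ in how the triples are produced (prompted LLM versus a fine-tuned extractor); the graph-assembly step itself is this simple.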
Votes: Srividya, Xinchen
#3160
-
Tran 2021
SPBERT: an Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs
Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8–12, 2021, Proceedings, Part I 2021;():512–523 Sanur, Bali, Indonesia Springer-Verlag 2021 DOI: 10.1007/978-3-030-92185-9_42 · Ref ID: 7319 |
Votes: mohammed afaan, Ishan
#1257
-
Tran 2024
Enhancing Knowledge Retrieval with Topic Modeling for Knowledge-Grounded Dialogue
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():5986-5995 European Language Resources Association (ELRA) 2024 Ref ID: 4586 Knowledge retrieval is one of the major challenges in building a knowledge-grounded dialogue system. A common method is to use a neural retriever with a distributed approximate nearest-neighbor database to quickly find the relevant knowledge sentences. In this work, we propose an approach that utilizes topic modeling on the knowledge base to further improve retrieval accuracy and as a result, improve response generation. Additionally, we experiment with a large language model, ChatGPT, to take advantage of the improved retrieval performance to further improve the generation results. Experimental results on two datasets show that our approach can increase retrieval and generation performance. The results also indicate that ChatGPT is a better response generator for knowledge-grounded dialogue when relevant knowledge is provided. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Votes: Kwesi, Xinchen
#866
-
Trappey 2022
Using Machine Learning Language Models to Generate Innovation Knowledge Graphs for Patent Mining
To explore and understand the state-of-the-art innovations in any given domain, researchers often need to study many domain patents and synthesize their knowledge content. This study provides a smart patent knowledge graph generation system, adopting a machine learning (ML) natural language modeling approach, to help researchers grasp the patent knowledge by generating deep knowledge graphs. This research focuses on converting chemical utility patents, consisting of chemistries and chemical processes, into summarized knowledge graphs. The research methods are in two parts, i.e., the visualization of the chemical processes in the chemical patents' most relevant paragraphs and a knowledge graph of any domain-specific collection of patent texts. The ML language modeling algorithms, including ALBERT for text vectorization, Sentence-BERT for sentence classification, and KeyBERT for keyword extraction, are adopted. These models are trained and tested in the case study using 879 chemical patents in the carbon capture domain. The results demonstrate that the average retention rate of the summary graphs for five clustered patent texts exceeds 80%. The proposed approach is novel and proven to be reliable in graphical deep knowledge representation. |
Votes: Ishan, brandon
#3265
-
Trofimova 2024
CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers
arXiv 2024;(): 2024 Ref ID: 8557 This paper presents CodeRefine, a novel framework for automatically transforming research paper methodologies into functional code using Large Language Models (LLMs). Our multi-step approach first extracts and summarizes key text chunks from papers, analyzes their code relevance, and creates a knowledge graph using a predefined ontology. Code is then generated from this structured representation and enhanced through a proposed retrospective retrieval-augmented generation approach. CodeRefine addresses the challenge of bridging theoretical research and practical implementation, offering a more accurate alternative to LLM zero-shot prompting. Evaluations on diverse scientific papers demonstrate CodeRefine's ability to improve code implementation from the paper, potentially accelerating the adoption of cutting-edge algorithms in real-world applications. |
Votes: Davis, Srividya
#1266
-
Tsaneva 2024
Enhancing Scientific Knowledge Graph Generation Pipelines with LLMs and Human-in-the-Loop
CEUR Workshop Proceedings 2024;3780(): CEUR-WS 2024 Ref ID: 4121 Scientific Knowledge Graphs have recently become a powerful tool for exploring the research landscape and assisting scientific inquiry. It is crucial to generate and validate these resources to ensure they offer a comprehensive and accurate representation of specific research fields. However, manual approaches are not scalable, while automated methods often result in lower-quality resources. In this paper, we investigate novel validation techniques to improve the accuracy of automated KG generation methodologies, leveraging both a human-in-the-loop (HiL) and a large language model (LLM)-in-the-loop. Using the automated generation pipeline of the Computer Science Knowledge Graph as a case study, we demonstrate that precision can be increased by 12% (from 75% to 87%) using only LLMs. Moreover, a hybrid approach incorporating both LLMs and HiL significantly enhances both precision and recall, resulting in a 4% increase in the F1 score (from 77% to 81%). © 2022 Copyright for this paper by its authors. |
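The precision and F1 figures in this abstract combine through the standard harmonic-mean formula. A minimal sketch, where the recall value is an assumed placeholder for illustration (the paper's actual recall is not stated here):

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported precision rose from 0.75 to 0.87; pairing the higher precision
# with an illustrative recall of 0.76 shows how the two combine into F1.
print(round(f1(0.87, 0.76), 2))
```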
Votes: mohammed afaan, yuexi
#1612
-
Tsaneva 2024
LLM-driven Ontology Evaluation: Verifying Ontology Restrictions with ChatGPT
CEUR Workshop Proceedings 2024;3747():15 CEUR-WS 2024 Ref ID: 4382 Recent advancements in artificial intelligence, particularly in large language models (LLMs), have sparked interest in their application to knowledge engineering (KE) tasks. While existing research has primarily explored the utilisation of LLMs for constructing and completing semantic resources such as ontologies and knowledge graphs, the evaluation of these resources-addressing quality issues- has not yet been thoroughly investigated. To address this gap, we propose an LLM-driven approach for the verification of ontology restrictions. We replicate our previously conducted human-in-the-loop experiment using ChatGPT-4 instead of human contributors to assess whether comparable ontology verification results can be obtained. We find that (1) ChatGPT-4 achieves intermediate-to-expert scores on an ontology modelling qualification test; (2) the model performs ontology restriction verification with accuracy of 92.22%; (3) combining model answers on the same ontology axiom represented in different formalisms improves the accuracy to 96.67%; and (4) higher accuracy is observed in identifying defects related to the incompleteness of ontology axioms compared to errors due to restrictions misuse. Our results highlight the potential of LLMs in supporting knowledge engineering tasks and outline future research directions in the area. © 2024 Copyright for this paper by its authors. |
Votes: mohammed afaan, yuexi
#3331
-
Tu 2024
DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning
arXiv 2024;(): 2024 Ref ID: 8361 The advancement of large language models (LLMs) relies on evaluation using public benchmarks, but data contamination can lead to overestimated performance. Previous research has focused on detecting contamination by determining whether the model has seen the exact same data during training. Moreover, prior work has shown that even training on data similar to benchmark data inflates performance, namely in-distribution contamination. In this work, we argue that in-distribution contamination can lead to a performance drop on OOD benchmarks. To effectively detect in-distribution contamination, we propose DICE, a novel method that leverages the internal states of LLMs to locate-then-detect the contamination. DICE first identifies the layer most sensitive to contamination, then trains a classifier based on the internal states of that layer. Experiments reveal DICE's high accuracy in detecting in-distribution contamination across various LLMs and math reasoning datasets. We also show the generalization capability of the trained DICE detector, which is able to detect contamination across multiple benchmarks with similar distributions. Additionally, we find that DICE's predictions correlate with the performance of LLMs fine-tuned by either us or other organizations, achieving a coefficient of determination (R²) between 0.61 and 0.75. The code and data are available at https://github.com/THU-KEG/DICE. |
Votes: yuexi, Srividya
#3660
-
Tulchinskii 2024
Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA
arXiv 2024;(): 2024 Ref ID: 8653 A standard way to evaluate the abilities of an LLM involves presenting a multiple-choice question and selecting the option with the highest logit as the model's predicted answer. However, such a format for evaluating LLMs has limitations, since even if the model knows the correct answer, it may struggle to select the corresponding letter simply due to difficulties in following this rigid format. To address this, we introduce new scores that better capture and reveal the model's underlying knowledge: the Query-Key Score (QK-score), derived from the interaction between query and key representations in attention heads, and the Attention Score, based on attention weights. These scores are extracted from specific select-and-copy heads, which show consistent performance across popular Multi-Choice Question Answering (MCQA) datasets. Based on these scores, our method improves knowledge extraction, yielding up to 16% gain for LLaMA2-7B and up to 10% for larger models on popular MCQA benchmarks. At the same time, the accuracy on a simple synthetic dataset, where the model explicitly knows the right answer, increases by almost 60%, achieving nearly perfect accuracy, therefore demonstrating the method's efficiency in mitigating MCQA format limitations. To support our claims, we conduct experiments on models ranging from 7 billion to 70 billion parameters in both zero- and few-shot setups. |
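The query-key interaction behind the QK-score can be illustrated with plain dot products: score each answer option by how well its "key" vector aligns with the question's "query" vector, and pick the best-aligned option instead of the highest letter logit. The vectors below are made-up stand-ins, not real attention-head activations.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Invented query vector for the question and key vectors for the options.
query = [0.9, 0.1, 0.4]
option_keys = {
    "A": [0.1, 0.8, 0.2],
    "B": [0.8, 0.2, 0.5],
    "C": [0.2, 0.1, 0.1],
}

# Select the option whose key aligns best with the query, a toy analogue
# of reading the answer off a select-and-copy head.
best = max(option_keys, key=lambda o: dot(query, option_keys[o]))
print(best)
```

In the actual method the query and key come from a specific attention head inside the model; the selection rule is the same maximization.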
Votes: Srividya, Ishan
#1654
-
Tuozzo 2024
Moving from Tabular Knowledge Graph Quality Assessment to RDF Triples Leveraging ChatGPT
CEUR Workshop Proceedings 2024;3747():9 CEUR-WS 2024 Ref ID: 4343 Data quality assessment is a multifaceted challenge involving various dimensions such as accessibility, interlinking, and completeness. These dimensions are domain-dependent and can be aggregated into a score between 0 and 1, facilitating dataset ranking based on quality. Achieving effective representation and explanation of these rankings poses significant challenges akin to those in machine learning, where interpretability and understandability are crucial. In the domain of natural language processing, data interpretation is a critical yet complex process, often requiring domain expertise and significant resources. Advanced large language models (LLMs) offer promise in automating annotation tasks, ensuring consistency, and adapting to specific domains. Leveraging such models for knowledge representation tasks necessitates adept prompt engineering. This study focuses on applying state-of-the-art prompt engineering methods, particularly using GPT-3.5, for representing knowledge related to dataset quality. By exploring techniques to extract RDF triples from textual data without predefined labels or constraints, this work aims to enhance interpretability and understanding of dataset quality assessment results while verifying the feasibility of automatic knowledge representation leveraging LLMs. © 2022 Copyright for this paper by its authors. |
Votes: mohammed afaan, yuexi
#833
-
Tupayachi 2024
Towards Next-Generation Urban Decision Support Systems through AI-Powered Construction of Scientific Ontology Using Large Language Models-A Case in Optimizing Intermodal Freight Transportation
Highlights What are the main findings? We have developed an integrated and automated methodology that leverages a pre-trained Large Language Model (LLM) to generate scenario-based ontologies and knowledge graphs from research articles and technical manuals. Our methodology utilizes the ChatGPT API as the primary reasoning engine, supplemented by Natural Language Processing modules and carefully engineered prompts. This combination enables an automated tool capable of generating ontologies independently. The ontologies generated through our AI-powered method are interoperable and can significantly facilitate the design of data models and software architecture, particularly in the development of urban decision support systems. What is the implication of the main finding? We compared ontologies generated by our LLM with those created by human experts through CQ-based qualitative evaluation, assessing the reliability and feasibility of our approach. The methodology has been successfully applied to intermodal freight data and simulations. This has allowed us to generate a scenario-based ontology and knowledge graph that enhances data discovery, integration, and management, thereby supporting network optimization and multiple criteria decision analysis. Our methodology is both generalizable and adaptive, enabling the automation of ontology generation to support the development of urban and environmental decision support systems across various disciplines.
Abstract The incorporation of Artificial Intelligence (AI) models into various optimization systems is on the rise. However, addressing complex urban and environmental management challenges often demands deep expertise in domain science and informatics. This expertise is essential for deriving data and simulation-driven insights that support informed decision-making. In this context, we investigate the potential of leveraging the pre-trained Large Language Models (LLMs) to create knowledge representations for supporting operations research. By adopting the ChatGPT-4 API as the reasoning core, we outline an applied workflow that encompasses natural language processing, Methontology-based prompt tuning, and Generative Pre-trained Transformer (GPT), to automate the construction of scenario-based ontologies using existing research articles and technical manuals of urban datasets and simulations. From these ontologies, knowledge graphs can be derived using widely adopted formats and protocols, guiding various tasks towards data-informed decision support. The performance of our methodology is evaluated through a comparative analysis that contrasts our AI-generated ontology with the widely recognized pizza ontology, commonly used in tutorials for popular ontology software. We conclude with a real-world case study on optimizing the complex system of multi-modal freight transportation. Our approach advances urban decision support systems by enhancing data and metadata modeling, improving data integration and simulation coupling, and guiding the development of decision support strategies and essential software components. |
Votes: Xinchen, mohammed afaan
#548
-
Uma 2023
Masking Language Model Mechanism with Event-Driven Knowledge Graphs for Temporal Relations Extraction from Clinical Narratives
12th International Conference on Complex Networks and their Applications (COMPLEX NETWORKS) 2023;1141():162-174 Menton, FRANCE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-53468-3_14 · Ref ID: 3524 For many natural language processing systems, the extraction of temporal links and associations from clinical narratives has been a critical challenge. To understand such processes, we must be aware of the occurrences of events and their time or temporal aspect by constructing a chronology for the sequence of events. The primary objective of temporal relation extraction is to identify relationships and correlations between entities, events, and expressions. We propose a novel architecture leveraging a Transformer-based graph neural network by combining textual data with event graph embeddings for predicting temporal links across events, entities, document creation time and expressions. We demonstrate our preliminary findings on the i2b2 temporal relations corpus for predicting BEFORE, AFTER and OVERLAP links with the event graph for the correct set of relations. Various Biomedical-BERT embedding types were benchmarked for comparison, with our methodology yielding the best performance on PubMed BERT with the language model masking (LMM) mechanism. This illustrates the effectiveness of our proposed strategy. |
Votes: Kwesi, Davis
#1706
-
vanCauter 2024
Ontology-guided Knowledge Graph Construction from Maintenance Short Texts
KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():75-84 Association for Computational Linguistics (ACL) 2024 Ref ID: 4315 Large-scale knowledge graph construction remains infeasible since it requires significant human-expert involvement. Further complications arise when building graphs from domain-specific data due to their unique vocabularies and associated contexts. In this work, we demonstrate the ability of open-source large language models (LLMs), such as Llama-2 and Llama-3, to extract facts from domain-specific Maintenance Short Texts (MSTs). We employ an approach which combines ontology-guided triplet extraction and in-context learning. By using only 20 semantically similar examples with the Llama-3-70B-Instruct model, we achieve performance comparable to previous methods that relied on fine-tuning techniques like SpERT and REBEL. This indicates that domain-specific fact extraction can be accomplished through inference alone, requiring minimal labeled data. This opens up possibilities for effective and efficient semi-automated knowledge graph construction for domain-specific data. ©2024 Association for Computational Linguistics. |
Votes: mohammed afaan, yuexi
#3853
-
Van 2024
Rx Strategist: Prescription Verification using LLM Agents System
arXiv 2024;(): 2024 Ref ID: 8580 To protect patient safety, modern pharmaceutical complexity demands strict prescription verification. We offer a new approach - Rx Strategist - that makes use of knowledge graphs and different search strategies to enhance the power of Large Language Models (LLMs) inside an agentic framework. This multifaceted technique allows for a multi-stage LLM pipeline and reliable information retrieval from a custom-built active ingredient database. Different facets of prescription verification, such as indication, dose, and possible drug interactions, are covered in each stage of the pipeline. We alleviate the drawbacks of monolithic LLM techniques by spreading reasoning over these stages, improving correctness and reliability while reducing memory demands. Our findings demonstrate that Rx Strategist surpasses many current LLMs, achieving performance comparable to that of a highly experienced clinical pharmacist. In the complicated world of modern medications, this combination of LLMs with organized knowledge and sophisticated search methods presents a viable avenue for reducing prescription errors and enhancing patient outcomes. |
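The multi-stage structure this abstract describes, where each stage verifies one facet (indication, dose, interactions) and a prescription passes only if every stage passes, can be sketched as a list of named checks. The stage logic and drug data below are invented placeholders, not the system's actual rules or database.

```python
def verify(rx, stages):
    """Run each verification stage in order; return the names of the
    stages that fail (empty list means the prescription passes)."""
    return [name for name, check in stages if not check(rx)]

# Hypothetical prescription record with made-up fields and limits.
rx = {"drug": "amoxicillin", "dose_mg": 500, "max_dose_mg": 1000,
      "interactions": []}

stages = [
    ("dose", lambda r: r["dose_mg"] <= r["max_dose_mg"]),
    ("interactions", lambda r: not r["interactions"]),
]

print(verify(rx, stages))
```

In the described system each stage would be an LLM agent backed by the ingredient database rather than a lambda, but the pass/fail aggregation across facets is the same shape.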
Votes: yuexi, mohammed afaan
#2102
-
Varadarajan 2015
Affordance and k-TR Augmented Alphabet based Neuro-Symbolic language — Af-kTRAANS — A Human-Robot Interaction meta-language
2015 20th International Conference on Methods and Models in Automation and Robotics (MMAR) 2015;():394-399 2015 DOI: 10.1109/MMAR.2015.7283908 · Ref ID: 6092 Human-Robot Interaction (HRI) and Inter-Robot Communication (ICI) are rapidly evolving fields with little standardization. A number of middleware architectures, frameworks and programming languages exist for implementing algorithms on robots. Also, efforts have been made to enable robots to understand the multitude of natural languages available. Nevertheless, there is a definite lack of intermediary languages for the representation of symbol grounding mechanisms in robots and standards for inter-robot cognitive communication. We address this void by presenting an intermediary meta-language based on a perceptually grounded algorithmic alphabet - the Affordance and kTR Augmented Alphabet based Neuro-Symbolic language, in short Af-kTRAANS, yielding an abstract layer sandwiched between the natural and the programming language layers that robots can use for knowledge representation, sharing and communication, while being agnostic to the embodiment, the pertinent human language, as well as socio-cultural contexts and environments. Based on the k-TR theory of cognitive visual perception and implemented for practical systems using the Affordance Network (AfNet) and the AfRob ontology, the graphical language can support a wide variety of object definition phrases as well as action verbs/object interaction commands while providing the necessary succinctness for tractable modeling. The various aspects of this cognitive inter-robot communication language are presented in this paper. Several examples of usage of the graphical language for common robotic task based queries are demonstrated along with the grounding mechanisms. |
Votes: mohammed afaan, Ishan
#427
-
Varshney 2023
Knowledge graph assisted end-to-end medical dialog generation
Medical dialog systems have the potential to assist e-medicine in improving access to healthcare services, improving patient treatment quality, and lowering medical expenses. In this research, we describe a knowledge-grounded conversation generation model that demonstrates how large-scale medical information in the form of knowledge graphs can aid in language comprehension and generation in medical dialog systems. Generic responses are often produced by existing generative dialog systems, resulting in monotonous and uninteresting conversations. To solve this problem, we combine various pre-trained language models with a medical knowledge base (UMLS) to generate clinically correct and human-like medical conversations using the recently released MedDialog-EN dataset. The medical-specific knowledge graph contains broadly 3 types of medical-related information, including disease, symptom and laboratory test. We perform reasoning over the retrieved knowledge graph by reading the triples in each graph using MedFact attention, which allows us to use semantic information from the graphs for better response generation. In order to preserve medical information, we employ a policy network, which effectively injects relevant entities associated with each dialog into the response. We also study how transfer learning can significantly improve the performance by utilizing a relatively small corpus, created by extending the recently released CovidDialog dataset, containing the dialogs for diseases that are symptoms of Covid-19. Empirical results on the MedDialog corpus and the extended CovidDialog dataset demonstrate that our proposed model significantly outperforms the state-of-the-art methods in terms of both automatic evaluation and human judgment. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3531
-
Vasisht 2024
Infusing Knowledge into Large Language Models with Contextual Prompts
arXiv 2024;(): 2024 Ref ID: 8155 Knowledge infusion is a promising method for enhancing Large Language Models for domain-specific NLP tasks rather than pre-training models over large data from scratch. These augmented LLMs typically depend on additional pre-training or knowledge prompts from an existing knowledge graph, which is impractical in many applications. In contrast, knowledge infusion directly from relevant documents is more generalisable and alleviates the need for structured knowledge graphs while also being useful for entities that are usually not found in any knowledge graph. With this motivation, we propose a simple yet generalisable approach for knowledge infusion by generating prompts from the context in the input text. Our experiments show the effectiveness of our approach which we evaluate by probing the fine-tuned LLMs. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2620
-
Vassev 2012
Knowledge representation with KnowLang: the marXbot case study
2012 IEEE 11th International Conference on Cybernetic Intelligent Systems (CIS) 2012;():18-23 2012 DOI: 10.1109/CIS.2013.6782155 · Ref ID: 6051 Intelligent systems are capable of AI exhibited via knowledge representation and reasoning, which helps to connect abstract knowledge symbols to real-world meanings. This paper presents a formal language for knowledge representation called KnowLang. The language implies a multi-tier specification model emphasizing knowledge corpuses, knowledge base operators and inference primitives. The approach allows for efficient and comprehensive knowledge structuring where ontologies are integrated with rules and Bayesian networks. The paper presents the KnowLang specification constructs formally along with a case study based on a mobile robotics platform. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1864
-
Venkatakrishnan 2024
Semantic interlinking of Immigration Data using LLMs for Knowledge Graph Construction
WWW 2024 Companion - Companion Proceedings of the ACM Web Conference 2024;():605-608 Association for Computing Machinery, Inc 2024 DOI: 10.1145/3589335.3651557 · Ref ID: 4075 The challenge of managing immigration data is exacerbated by its reliance on paper-based, evidence-driven records maintained by legal professionals, creating obstacles for efficient processing and analysis due to inherent trust issues with AI-based systems. This paper introduces a cutting-edge framework to surmount these hurdles by synergizing Large Language Models (LLMs) with Knowledge Graphs (KGs), revolutionizing traditional data handling methods. Our method transforms archaic, paper-based immigration records into a structured, interconnected knowledge network that intricately mirrors the legal and procedural nuances of immigration, ensuring a dynamic and trustworthy platform for data analysis. Utilizing LLMs, we extract vital entities and relationships from diverse legal documents to forge a comprehensive knowledge graph, encapsulating the complex legalities and procedural disparities in immigration processes and mapping the multifaceted interactions among stakeholders like applicants, sponsors, and legal experts. This graph not only facilitates a deep dive into the legal stipulations but also incorporates them, significantly boosting the system’s reliability and precision. With the integration of Retrieval Augmented Generation (RAG) for exact, context-aware data retrieval and Augmented Knowledge Creation for developing a conversational interface via LLMs, our framework offers a scalable, adaptable solution to immigration data management. This innovative amalgamation of LLMs, KGs, and RAG techniques marks a paradigm shift towards more informed, efficient, and trustworthy decision-making in the sphere of global migration, setting a new benchmark for legal technology and data source management. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM. |
yuexi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3730
-
Vijay 2022
NERDA-Con: Extending NER models for Continual Learning – Integrating Distinct Tasks and Updating Distribution Shifts
arXiv 2022;(): 2022 Ref ID: 7562 With increasing applications in areas such as biomedical information extraction pipelines and social media analytics, Named Entity Recognition (NER) has become an indispensable tool for knowledge extraction. However, with the gradual shift in language structure and vocabulary, NERs are plagued with distribution shifts, making them redundant or not as profitable without re-training. Re-training NERs based on Large Language Models (LLMs) from scratch over newly acquired data poses economic disadvantages. In contrast, re-training only with newly acquired data will result in Catastrophic Forgetting of previously acquired knowledge. Therefore, we propose NERDA-Con, a pipeline for training NERs with LLM bases by incorporating the concept of Elastic Weight Consolidation (EWC) into the NER fine-tuning NERDA pipeline. As we believe our work has implications to be utilized in the pipeline of continual learning and NER, we open-source our code as well as provide the fine-tuning library of the same name NERDA-Con at https://github.com/SupritiVijay/NERDA-Con and https://pypi.org/project/NERDA-Con/. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#716
-
Vizcarra 2024
Representing the Interaction between Users and Products via LLM-assisted Knowledge Graph Construction
18th IEEE International Conference on Semantic Computing (ICSC) 2024;():231-232 Laguna Hills, CA Ieee Computer Soc 2024 DOI: 10.1109/icsc59802.2024.00043 · Ref ID: 2985 To understand user behavior, representing the semantic knowledge of user-product interaction is essential. In this paper, we represent the interaction between user and product via large language model (LLM)-assisted knowledge graph construction. We capture users' behavioral actions and static properties of the products from raw text data of "user review" and "product catalog". Moreover, the information needed for updating the knowledge graph is captured by raw texts of "news related to the products". The proposed methodology integrates them as a single knowledge graph to provide causal reasoning on user-product interaction. To alleviate the situation where a small quantity of annotated text exists in these data, we use LLM as a data annotator and augmentor. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3852
-
Vogt 2024
Rosetta Statements: Lowering the Barrier for Semantic Parsing and Increasing the Cognitive Interoperability of Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8492 Machines need data and metadata to be machine-actionable and FAIR (findable, accessible, interoperable, reusable) to manage increasing data volumes. Knowledge graphs and ontologies are key to this, but their use is hampered by high access barriers due to required prior knowledge in semantics and data modelling. The Rosetta Statement approach proposes modeling English natural language statements instead of a mind-independent reality. We propose a metamodel for creating semantic schema patterns for simple statement types. The approach supports versioning of statements and provides a detailed editing history. Each Rosetta Statement pattern has a dynamic label for displaying statements as natural language sentences. Implemented in the Open Research Knowledge Graph (ORKG) as a use case, this approach allows domain experts to define data schema patterns without needing semantic knowledge. Future plans include combining Rosetta Statements with semantic units to organize ORKG into meaningful subgraphs, improving usability. A search interface for querying statements without needing SPARQL or Cypher knowledge is also planned, along with tools for data entry and display using Large Language Models and NLP. The Rosetta Statement metamodel supports a two-step knowledge graph construction procedure. Domain experts can model semantic content without support from ontology engineers, lowering entry barriers and increasing cognitive interoperability. The second level involves developing semantic graph patterns for reasoning, requiring collaboration with ontology engineers. |
Mike
voted
brandon
voted
Final decision
What was the agreed final decision?
#668
-
Vulić 2020
Probing Pretrained Language Models for Lexical Semantics
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():7222-7240 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3648 The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. While prior research focused on morphosyntactic, semantic, and world knowledge, it remains unclear to which extent LMs also derive lexical type-level knowledge from words in context. In this work, we present a systematic empirical analysis across six typologically diverse languages and five different lexical tasks, addressing the following questions: 1) How do different lexical knowledge extraction strategies (monolingual versus multilingual source LM, out-of-context versus in-context encoding, inclusion of special tokens, and layer-wise averaging) impact performance? How consistent are the observed effects across tasks and languages? 2) Is lexical knowledge stored in few parameters, or is it scattered throughout the network? 3) How do these representations fare against traditional static word vectors in lexical tasks? 4) Does the lexical information emerging from independently trained monolingual LMs display latent similarities? Our main results indicate patterns and best practices that hold universally, but also point to prominent variations across languages and tasks. Moreover, we validate the claim that lower Transformer layers carry more type-level lexical knowledge, but also show that this knowledge is distributed across multiple layers. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3339
-
Wadhwa 2024
Distilling Event Sequence Knowledge From Large Language Models
arXiv 2024;(): 2024 Ref ID: 8032 Event sequence models have been found to be highly effective in the analysis and prediction of events. Building such models requires availability of abundant high-quality event sequence data. In certain applications, however, clean structured event sequences are not available, and automated sequence extraction results in data that is too noisy and incomplete. In this work, we explore the use of Large Language Models (LLMs) to generate event sequences that can effectively be used for probabilistic event model construction. This can be viewed as a mechanism of distilling event sequence knowledge from LLMs. Our approach relies on a Knowledge Graph (KG) of event concepts with partial causal relations to guide the generative language model for causal event sequence generation. We show that our approach can generate high-quality event sequences, filling a knowledge gap in the input KG. Furthermore, we explore how the generated sequences can be leveraged to discover useful and more complex structured knowledge from pattern mining and probabilistic event models. We release our sequence generation code and evaluation framework, as well as corpus of event sequence data. |
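The guidance mechanism this abstract describes — a partial causal knowledge graph of event concepts constraining which event sequences the language model expands — can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the `causal_chains` helper and the example events are invented for demonstration.

```python
# Toy sketch of KG-guided sequence generation: a partial causal KG of
# event concepts constrains which causally ordered event chains a
# generator may later expand. Event names are hypothetical.
def causal_chains(kg, start, depth):
    """Enumerate causally ordered chains of at most depth+1 events.

    kg: dict mapping an event concept to the events it (partially) causes.
    """
    successors = kg.get(start, [])
    if depth == 0 or not successors:
        return [[start]]
    chains = []
    for nxt in successors:
        for tail in causal_chains(kg, nxt, depth - 1):
            chains.append([start] + tail)
    return chains

kg = {"storm": ["power outage"], "power outage": ["repair crew dispatched"]}
chains = causal_chains(kg, "storm", 2)
```

Each enumerated chain could then be handed to an LLM as a scaffold to fill in with concrete events, which is one plausible reading of "guiding the generative language model for causal event sequence generation".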
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1000
-
Wan 2024
Aspect-Based Sentiment Classification Model Based on Multi-view Information Fusion
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2024;14883 LNCS():16-28 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-981-97-7707-5_2 · Ref ID: 4287 Aspect-based sentiment classification is one of the hot tasks in the field of natural language processing. The task aims to judge the sentiment polarity of the target word, also known as the aspect term, specified in the sentence. The current mainstream models aggregate the information of the aspect term's neighbor nodes through a graph neural network model to judge the sentiment polarity. Compared with the previous research, this method has achieved obvious results, but it still faces some problems. First, the limited scale of the existing public data sets constrains the training of the model, and the general knowledge representation ability has certain deficiencies. Second, existing methods use single-view information to judge sentiment polarity but lack multi-view information and corresponding information fusion methods; the complementarity of sentiment feature information from different perspectives has not been studied. To solve the above problems, an aspect-based sentiment classification model based on multi-view information fusion is proposed. An inference result set constructed from a large language model (LLM) is used to enhance the model's knowledge representation ability. A multi-view information fusion module is proposed to integrate information from two aspects, local fusion and global fusion, and make full use of information from different angles. The experimental results show that the model has higher classification ability than the current mainstream models, and the effectiveness of each module of the model is verified by a variety of experiments. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1894
-
Wang 2024
SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2024;():13384-13394 IEEE Computer Society 2024 DOI: 10.1109/CVPR52733.2024.01271 · Ref ID: 4524 Learning commonsense reasoning from visual contexts and scenes in the real world is a crucial step toward advanced artificial intelligence. However, existing video reasoning benchmarks are still inadequate since they were mainly designed for factual or situated reasoning and rarely involve broader knowledge in the real world. Our work aims to delve deeper into reasoning evaluations, specifically within dynamic, open-world, and structured context knowledge. We propose a new benchmark (SOK-Bench), consisting of 44K questions and 10K situations with instance-level annotations depicted in the videos. The reasoning process is required to understand and apply situated knowledge and general knowledge for problem-solving. To create such a dataset, we propose an automatic and scalable generation method to generate question-answer pairs, knowledge graphs, and rationales by instructing the combinations of LLMs and MLLMs. Concretely, we first extract observable situated entities, relations, and processes from videos for situated knowledge and then extend to open-world knowledge beyond the visible content. The task generation is facilitated through multiple dialogues as iterations and subsequently corrected and refined by our designed self-promptings and demonstrations. With a corpus of both explicit situated facts and implicit commonsense, we generate associated question-answer pairs and reasoning processes, finally followed by manual reviews for quality assurance. We evaluated recent mainstream large vision-language models on the benchmark and found several insightful conclusions. For more information, please refer to our benchmark at www.bobbywu.com/SOKBench. © 2024 IEEE.
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3527
-
Wang 2020
Inductive Learning on Commonsense Knowledge Graph Completion
arXiv 2020;(): 2020 Ref ID: 7407 Commonsense knowledge graph (CKG) is a special type of knowledge graph (KG), where entities are composed of free-form text. However, most existing CKG completion methods focus on the setting where all the entities are presented at training time. Although this setting is standard for conventional KG completion, it has limitations for CKG completion. At test time, entities in CKGs can be unseen because they may have unseen text/names and entities may be disconnected from the training graph, since CKGs are generally very sparse. Here, we propose to study the inductive learning setting for CKG completion where unseen entities may present at test time. We develop a novel learning framework named InductivE. Different from previous approaches, InductivE ensures the inductive learning capability by directly computing entity embeddings from raw entity attributes/text. InductivE consists of a free-text encoder, a graph encoder, and a KG completion decoder. Specifically, the free-text encoder first extracts the textual representation of each entity based on the pre-trained language model and word embedding. The graph encoder is a gated relational graph convolutional neural network that learns from a densified graph for more informative entity representation learning. We develop a method that densifies CKGs by adding edges among semantically related entities and provides more supportive information for unseen entities, leading to better generalization ability of entity embedding for unseen entities. Finally, InductivE employs Conv-TransE as the CKG completion decoder. Experimental results show that InductivE significantly outperforms state-of-the-art baselines in both standard and inductive settings on ATOMIC and ConceptNet benchmarks. InductivE performs especially well on inductive scenarios where it achieves above 48% improvement over present methods.
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#458
-
Wang 2023
Knowledge Graphs Enhanced Large Language Model Prompt for Electric Power Question Answering
7th International Conference on Electronic Information Technology and Computer Engineering (EITCE) 2023;():24-29 Xiamen, PEOPLES R CHINA Assoc Computing Machinery 2023 DOI: 10.1145/3650400.3650405 · Ref ID: 2925 With the continuous development and digital transformation in the field of electric power, the application of large language models in the electric power industry has become a remarkable trend. The electric power industry is an information-intensive domain involving extensive data processing, predictive analysis, and decision-making. Therefore, the application of large language models in the electric power sector is of great significance. Current large language models such as GPT-3.5 and GLM can perform well in tasks such as question answering dialogues. However, these models still face challenges such as answer hallucination and inaccurate responses. This paper proposes a method to enhance question answering in large language models using knowledge graphs, aiming to improve the accuracy and reliability of these models in question answering tasks in the electric power domain. The proposed method first utilizes local electric power data to extract triplets and generate a question answering dataset specific to the electric power domain using a large language model. Then, the relationships of the knowledge graph triplets are incorporated into the question prompt to enhance the quality of the model's answers. Furthermore, we fine-tune the large language model using the expanded question set derived from the triplets as knowledge-enhanced data. Subsequently, we conduct experiments on both an electric power question answering dataset and a knowledge graph question answering dataset. The experimental results demonstrate that our method significantly improves various metrics of the large language model in the electric power question answering task. This research provides new insights and approaches to enhance the effectiveness of question answering systems in the electric power domain. Future studies can further explore and optimize this prompt expansion method for application in broader domains and tasks.
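The prompt-construction step this abstract describes — injecting KG triplet relations into the question prompt — can be sketched as follows. This is a hedged illustration, not the paper's code; `build_kg_prompt` and the sample triples are hypothetical.

```python
# Hedged sketch: render knowledge-graph triples as factual context in a
# question prompt, as the abstract describes. The function name and the
# example triples are illustrative, not taken from the paper.
def build_kg_prompt(question, triples):
    """triples: iterable of (head, relation, tail) strings."""
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    return (
        "Answer the question using only the facts below.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\n"
        "Answer:"
    )

triples = [
    ("transformer", "is_part_of", "substation"),
    ("substation", "steps_down", "transmission voltage"),
]
prompt = build_kg_prompt("What does a substation do?", triples)
```

The same triples could also be rephrased as question-answer pairs to build the fine-tuning set the abstract mentions.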
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1694
-
Wang 2021
Novel semantic retrieval approach for semi-structured knowledge in industrial software development
Jisuanji Jicheng Zhizao Xitong 2021;27(8):2371-2381 2021 DOI: 10.13196/j.cims.2021.08.019 · Ref ID: 5506 In knowledge-driven industrial software development, assisting engineers in searching heterogeneous semi-structured knowledge efficiently and accurately is a major issue. A semantic retrieval method was proposed based on the knowledge super network model. The knowledge super network, consisting of a product subnet, an object subnet, and a knowledge subnet, was built with the relations between the concepts of code reuse and the attributes of engineering knowledge. To calculate the process context correlation between user query and engineering knowledge, the conceptual knowledge and language model were integrated by a Bayesian method. Experimental results on the Microsoft knowledge base dataset show that the proposed approach could improve the precision of knowledge retrieval compared with several semantic retrieval methods. The feasibility and effectiveness of the approach were also verified. © 2021, Editorial Department of CIMS. All rights reserved.
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#83
-
Wang 2021
Can Generative Pre-trained Language Models Serve as Knowledge Bases for Closed-book QA?
Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP) 2021;():3241-3251 Electr Network Assoc Computational Linguistics-Acl 2021 Ref ID: 3546 Recent work has investigated the interesting question of using pre-trained language models (PLMs) as knowledge bases for answering open questions. However, existing work is limited to small benchmarks with high test-train overlaps. We construct a new dataset of closed-book QA using SQuAD, and investigate the performance of BART. Experiments show that it is challenging for BART to remember training facts with high precision, and also challenging to answer closed-book questions even if relevant knowledge is retained. Some promising directions are found, including decoupling the knowledge memorizing process from the QA fine-tuning process, and forcing the model to recall relevant knowledge when answering questions.
Xinchen
voted
Ishan
voted
Final decision
What was the agreed final decision?
#517
-
Wang 2024
Let Me Show You Step by Step: An Interpretable Graph Routing Network for Knowledge-based Visual Question Answering
47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():1984-1994 Washington, DC Assoc Computing Machinery 2024 DOI: 10.1145/3626772.3657790 · Ref ID: 3424 Visual Question Answering based on external Knowledge Bases (KB-VQA) requires a model to incorporate knowledge beyond the content of the given image and question for answer prediction. Most existing works made efforts on using graph neural networks or Multi-modal Large Language Models to incorporate external knowledge for answer generation. Despite the promising results, they have limited interpretability and exhibit a deficiency in handling questions with unseen answers. In this paper, we propose a novel interpretable graph routing network (GRN) which explicitly conducts entity routing over a constructed scene knowledge graph step by step for KB-VQA. At each step, GRN keeps an entity score vector representing how likely each entity is to be activated as the answer, and a transition matrix representing the transition probability from one entity to another. To answer the given question, GRN focuses on certain keywords of the question at each step and correspondingly conducts entity routing by transiting the entity scores according to the transition matrix computed with reference to the focused question keywords. In this way, it clearly provides the reasoning process of KB-VQA and can handle questions with unseen answers without distinction. Experiments on the benchmark dataset KRVQA have demonstrated that GRN improves the performance of KB-VQA by a large margin, surpassing existing state-of-the-art KB-VQA methods and Multi-modal Large Language Models, as well as showing competent capability in handling unseen answers and good interpretability in KB-VQA.
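The per-step routing the abstract describes — an entity score vector propagated through a transition matrix — reduces to a vector-matrix product. A minimal sketch, with a fixed toy matrix standing in for the keyword-conditioned transition matrix GRN would compute:

```python
# Toy sketch of one entity-routing step: entity scores are transited
# through a transition matrix. In GRN the matrix would be computed from
# the focused question keywords; here it is a fixed 2x2 example.
def route(scores, transition):
    """One step: new_score[j] = sum_i scores[i] * transition[i][j]."""
    n = len(scores)
    return [sum(scores[i] * transition[i][j] for i in range(n)) for j in range(n)]

scores = [1.0, 0.0]               # all routing mass starts on entity 0
T = [[0.2, 0.8], [0.5, 0.5]]      # hypothetical transition probabilities
stepped = route(scores, T)
```

Repeating `route` once per reasoning step yields the interpretable entity trajectory the paper highlights.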
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3194
-
Wang 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
arXiv 2024;(): 2024 Ref ID: 8678 Retrieval-Augmented Generation (RAG), while effective in integrating external knowledge to address the limitations of large language models (LLMs), can be undermined by imperfect retrieval, which may introduce irrelevant, misleading, or even malicious information. Despite its importance, previous studies have rarely explored the behavior of RAG through joint analysis on how errors from imperfect retrieval attribute and propagate, and how potential conflicts arise between the LLMs' internal knowledge and external sources. We find that imperfect retrieval augmentation might be inevitable and quite harmful, through controlled analysis under realistic conditions. We identify the knowledge conflicts between LLM-internal and external knowledge from retrieval as a bottleneck to overcome in the post-retrieval stage of RAG. To render LLMs resilient to imperfect retrieval, we propose Astute RAG, a novel RAG approach that adaptively elicits essential information from LLMs' internal knowledge, iteratively consolidates internal and external knowledge with source-awareness, and finalizes the answer according to information reliability. Our experiments using Gemini and Claude demonstrate that Astute RAG significantly outperforms previous robustness-enhanced RAG methods. Notably, Astute RAG is the only approach that matches or exceeds the performance of LLMs without RAG under worst-case scenarios. Further analysis reveals that Astute RAG effectively resolves knowledge conflicts, improving the reliability and trustworthiness of RAG systems. |
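The consolidation idea the abstract outlines — elicit internal knowledge, tag passages by source, iteratively merge, then answer from the consolidated context — can be sketched as below. This is a hedged reconstruction, not the authors' implementation; `astute_answer` and its prompts are invented, and `generate` stands in for any LLM call.

```python
# Hedged sketch of source-aware consolidation in the spirit of Astute RAG.
# Everything here (function name, prompts) is illustrative; `generate`
# is an injected stand-in for an LLM call.
def astute_answer(question, retrieved, generate, rounds=2):
    # Step 1: adaptively elicit the model's internal knowledge.
    internal = generate(f"From memory, list facts relevant to: {question}")
    passages = [("internal", internal)] + [("external", p) for p in retrieved]
    # Step 2: iteratively consolidate, keeping source tags visible to the LLM.
    for _ in range(rounds):
        context = "\n".join(f"[{src}] {txt}" for src, txt in passages)
        merged = generate(
            "Keep only facts that are consistent across sources:\n" + context
        )
        passages = [("consolidated", merged)]
    # Step 3: answer from the consolidated, reliability-weighted context.
    return generate(f"Answer '{question}' using:\n{passages[0][1]}")
```

Injecting `generate` keeps the sketch testable with a stub and model-agnostic, matching the abstract's claim that the method works across different LLMs.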
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1786
-
Wang 2024
A Real-Time Rumor Detection Method Based on the Graph Attention Neural Network Integrated with the Knowledge Graph
Data. Anal. Knowl. Discov. 2024;8(6):95-106 2024 DOI: 10.11925/infotech.2096-3467.2023.0314 · Ref ID: 3955 [Objective] This paper aims to improve the accuracy of real-time rumor detection in social media and reduce the harm caused by rumors. [Methods] A real-time rumor detection method based on the graph attention neural network integrated with the knowledge graph is proposed. First, the background knowledge of the text is obtained from the external knowledge graph by knowledge distillation. Second, we transform the text and background knowledge into a weighted graph structure representation by pointwise mutual information, and a weighted graph attention neural network is used to learn the discontinuous semantic features of the text from the weighted graph. Then, the continuous semantic features of the text are learned by the pre-trained language model BERT, and the statistical features of users and content are converted into continuous vector representations using the embedding method. Finally, all the features are fused and input into the fully connected neural network for rumor detection. [Results] Experimental results on two public social media rumor datasets, PHEME and WEIBO, show that the method's accuracy reaches 92.1% and 84.0%, respectively, higher than the state-of-the-art baseline methods. [Limitations] The method does not fuse the image or video information that may be attached to a post and cannot perform multi-modal fusion rumor detection. [Conclusions] Fusion of background knowledge can complement the semantic representation of short texts. Fusion of user and content statistical features can support semantic features in decision making and improve the accuracy of the model. © 2024 Chinese Academy of Sciences. All rights reserved.
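The pointwise-mutual-information (PMI) edge weighting this abstract relies on can be illustrated over document-level word co-occurrence counts. A minimal sketch, assuming toy counts with no smoothing (the paper's exact windowing and weighting may differ):

```python
# Sketch of PMI edge weighting for a word graph: each pair of words that
# co-occurs in a document becomes a candidate edge, weighted by
# PMI = log( p(a,b) / (p(a) * p(b)) ); only positive PMI edges are kept.
import math
from collections import Counter
from itertools import combinations

def pmi_edges(docs):
    word_count, pair_count, n = Counter(), Counter(), 0
    for doc in docs:
        words = sorted(set(doc.split()))
        n += 1
        word_count.update(words)
        pair_count.update(combinations(words, 2))
    edges = {}
    for (a, b), c in pair_count.items():
        pmi = math.log((c / n) / ((word_count[a] / n) * (word_count[b] / n)))
        if pmi > 0:  # keep only positively associated word pairs
            edges[(a, b)] = pmi
    return edges
```

The resulting weighted edges would then feed the graph attention network described in the abstract.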
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#593
-
Wang 2024
Multivariate graph neural networks on enhancing syntactic and semantic for aspect-based sentiment analysis
Aspect-based sentiment analysis (ABSA) aims to predict sentiment orientations towards textual aspects by extracting insights from user comments. While pretrained large language models (LLMs) demonstrate proficiency in sentiment analysis, incorporating syntactic and semantic features into ABSA remains a challenge. Additionally, employing LLMs for sentiment analysis often requires significant computational resources, rendering them impractical for use by individuals or small-scale entities. To address this, we propose the semiotic signal integration network (SSIN), which effectively combines syntactic and semantic features. The core syncretic information network leverages isomorphism and syntax to enhance knowledge acquisition. The semantically guided syntactic attention module further enables integrated semiotic representations via sophisticated attention mechanisms. Experiments on the publicly available SemEval dataset show that SSIN performs better than existing state-of-the-art ABSA baselines and LLMs such as Llama and Alpaca with high accuracy and macro-F1 scores. Moreover, our model demonstrates exceptional interpretability and the ability to discern both positive and negative sentiments, which is vitally important for real-world applications such as social media monitoring, health care, and customer service. Code is available at https://github.com/AmbitYuki/SSIN. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3230
-
Wang 2024
BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering
arXiv 2024;(): 2024 Ref ID: 8110 Retrieval-augmented Large Language Models (LLMs) offer substantial benefits in enhancing performance across knowledge-intensive scenarios. However, these methods often face challenges with complex inputs and encounter difficulties due to noisy knowledge retrieval, notably hindering model effectiveness. To address this issue, we introduce BlendFilter, a novel approach that elevates retrieval-augmented LLMs by integrating query generation blending with knowledge filtering. BlendFilter proposes the blending process through its query generation method, which integrates both external and internal knowledge augmentation with the original query, ensuring comprehensive information gathering. Additionally, our distinctive knowledge filtering module capitalizes on the intrinsic capabilities of the LLM, effectively eliminating extraneous data. We conduct extensive experiments on three open-domain question answering benchmarks, and the findings clearly indicate that our innovative BlendFilter surpasses state-of-the-art baselines significantly. |
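BlendFilter's two named stages — query generation blending and LLM-based knowledge filtering — can be sketched as two small functions. The function names and prompts below are illustrative, not from the paper's code; `generate` and `judge` stand in for LLM calls.

```python
# Minimal sketch of the two stages the abstract names. All identifiers
# here are hypothetical; `generate`/`judge` are injected LLM stand-ins.
def blend_queries(query, generate):
    """Blend the original query with internally and externally augmented rewrites."""
    internal = generate(f"Rewrite using your own background knowledge: {query}")
    external = generate(f"Rewrite as a retrieval-friendly search query: {query}")
    return [query, internal, external]

def filter_knowledge(query, passages, judge):
    """Keep only passages the LLM judge deems relevant to the query."""
    return [p for p in passages if judge(query, p)]
```

Retrieval would run once per blended query, and `filter_knowledge` would prune the pooled results before answer generation.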
yuexi
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3030
-
Wang 2006
Towards Representing FCA-based Ontologies in Semantic Web Rule Language
The Sixth IEEE International Conference on Computer and Information Technology (CIT'06) 2006;():41-41 2006 DOI: 10.1109/CIT.2006.186 · Ref ID: 6139 Formal Concept Analysis (FCA) has been widely applied in many fields recently. In this paper, we introduce how a domain ontology can be constructed based on FCA. The resulting ontology constructed in this way is graphically represented as a concept lattice. After constructing FCA-based ontologies, it is necessary to represent them in a formalism suitable for sharing and reasoning. The Semantic Web Rule Language (SWRL), a W3C proposal, is an extension of OWL with Horn clause rules. We represent the FCA-based ontologies in SWRL with some extensions, which makes them more suitable for reasoning.
mohammed afaan
voted
yuexi
voted
#360
-
Wang 2024
Improving the Robustness of Knowledge-Grounded Dialogue via Contrastive Learning
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():19135-19143 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3483 Knowledge-grounded dialogue (KGD) learns to generate an informative response based on a given dialogue context and external knowledge (e.g., knowledge graphs; KGs). Recently, the emergence of large language models (LLMs) and pre-training techniques has brought great success to knowledge-grounded dialogue. However, when building KGD systems in real applications, various real-world noises are inevitable. For example, the dialogue context might involve perturbations such as misspellings and abbreviations. In addition, KGs typically suffer from incompleteness and may contain erroneous and outdated facts. Such real-world noises pose a challenge to the robustness of KGD systems and hinder their application in the real world. In this paper, we propose an entity-based contrastive learning framework for improving the robustness of KGD. Specifically, we make use of the entity information in a KGD sample to create both its positive and negative samples, which involve semantically irrelevant and semantically relevant perturbations, respectively. The contrastive learning framework ensures the KGD model is aware of these two types of perturbations, thus generating informative responses despite potentially noisy inputs in real applications. Experimental results on three benchmark datasets show that our method achieves new state-of-the-art performance in terms of automatic evaluation scores, verifying its effectiveness and potential. Furthermore, we show that our method can generate better responses than comparison models in both the noisy and the few-shot settings. |
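The entity-based sample construction described above can be sketched in a few lines: perturb an entity's surface form to get a positive (meaning-preserving) sample, and swap in a different entity to get a negative (meaning-changing) one. The misspelling function, entity names, and toy context are invented for illustration, not taken from the paper.

```python
# Build one positive and one negative contrastive sample from a dialogue
# context by perturbing the entity mention it contains.
import random

def misspell(word, seed=0):
    """Drop one character to simulate a real-world typo (semantics unchanged)."""
    rng = random.Random(seed)
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]

def make_contrastive_samples(context, entity, other_entities, seed=0):
    # Positive: same meaning, noisy surface form of the entity.
    positive = context.replace(entity, misspell(entity, seed))
    # Negative: fluent text, but the entity (and thus the meaning) changed.
    rng = random.Random(seed)
    negative = context.replace(entity, rng.choice(other_entities))
    return positive, negative

ctx = "Tell me more about aspirin and its side effects."
pos, neg = make_contrastive_samples(ctx, "aspirin", ["ibuprofen", "insulin"])
print(pos)
print(neg)
```

A real framework would encode both samples and push the model's representation toward the positive and away from the negative.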
Mike
voted
Davis
voted
#3117
-
Wang 2024
Causal-driven Large Language Models with Faithful Reasoning for Knowledge Question Answering
Proceedings of the 32nd ACM International Conference on Multimedia 2024;():4331–4340 Melbourne VIC, Australia Association for Computing Machinery 2024 DOI: 10.1145/3664647.3681263 · Ref ID: 7306 |
mohammed afaan
voted
yuexi
voted
#3557
-
Wang 2024
JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability
arXiv 2024;(): 2024 Ref ID: 8145 Large Language Models (LLMs) have demonstrated remarkable potential in medical knowledge acquisition and question-answering. However, LLMs can potentially hallucinate and yield factually incorrect outcomes, even with domain-specific pretraining. Previously, retrieval-augmented generation (RAG) has had limited success in addressing hallucinations. Unlike previous RAG methods, where the retrieval model was trained separately from the LLM, we introduce JMLR (which Jointly trains the LLM and information Retrieval) during the fine-tuning phase. The synchronized training mechanism enhances JMLR's ability to retrieve clinical guidelines and leverage medical knowledge to reason and answer questions, and reduces the demand for computational resources. We evaluated JMLR on the important medical question-answering application. Our experimental results demonstrate that JMLR-13B (70.5%) outperforms a previous state-of-the-art open-source model using conventional pre-training and fine-tuning, Meditron-70B (68.9%), and Llama2-13B with RAG (67.7%) on a medical question-answering dataset. Comprehensive evaluations reveal that JMLR-13B enhances reasoning quality and reduces hallucinations better than Claude3-Opus. Additionally, JMLR-13B (148 GPU hours) also trains much faster than Meditron-70B (42630 GPU hours). Through this work, we provide a new and efficient knowledge enhancement method for healthcare, demonstrating the potential of integrating retrieval and LLM training for medical question-answering systems. |
yuexi
voted
Srividya
voted
#1607
-
Wang 2024
LLM as Prompter: Low-resource Inductive Reasoning on Arbitrary Knowledge Graphs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():3742-3759 Association for Computational Linguistics (ACL) 2024 Ref ID: 4284 Knowledge Graph (KG) inductive reasoning, which aims to infer missing facts from new KGs that are not seen during training, has been widely adopted in various applications. One critical challenge of KG inductive reasoning is handling low-resource scenarios with scarcity in both textual and structural aspects. In this paper, we attempt to address this challenge with Large Language Models (LLMs). Particularly, we utilize the state-of-the-art LLMs to generate a graph-structural prompt to enhance the pre-trained Graph Neural Networks (GNNs), which brings us new methodological insights into KG inductive reasoning methods, as well as high generalizability in practice. On the methodological side, we introduce a novel pretraining and prompting framework PROLINK, designed for low-resource inductive reasoning across arbitrary KGs without requiring additional training. On the practical side, we experimentally evaluate our approach on 36 low-resource KG datasets and find that PROLINK outperforms previous methods in three-shot, one-shot, and zero-shot reasoning tasks, exhibiting average performance improvements of 20%, 45%, and 147%, respectively. Furthermore, PROLINK demonstrates strong robustness for various LLM prompts as well as full-shot scenarios. Our source code is available at https://github.com/KyneWang/ProLINK. © 2024 Association for Computational Linguistics. |
Xinchen
voted
mohammed afaan
voted
#3678
-
Wang 2024
LLMs Know What They Need: Leveraging a Missing Information Guided Framework to Empower Retrieval-Augmented Generation
arXiv 2024;(): 2024 Ref ID: 8246 Retrieval-Augmented Generation (RAG) demonstrates great value in alleviating outdated knowledge or hallucination by supplying LLMs with updated and relevant knowledge. However, there are still several difficulties for RAG in understanding complex multi-hop query and retrieving relevant documents, which require LLMs to perform reasoning and retrieve step by step. Inspired by human's reasoning process in which they gradually search for the required information, it is natural to ask whether the LLMs could notice the missing information in each reasoning step. In this work, we first experimentally verified the ability of LLMs to extract information as well as to know the missing. Based on the above discovery, we propose a Missing Information Guided Retrieve-Extraction-Solving paradigm (MIGRES), where we leverage the identification of missing information to generate a targeted query that steers the subsequent knowledge retrieval. Besides, we design a sentence-level re-ranking filtering approach to filter the irrelevant content out from document, along with the information extraction capability of LLMs to extract useful information from cleaned-up documents, which in turn to bolster the overall efficacy of RAG. Extensive experiments conducted on multiple public datasets reveal the superiority of the proposed MIGRES method, and analytical experiments demonstrate the effectiveness of our proposed modules. |
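The retrieve-extraction-solving loop described above can be sketched schematically: the model names what is still missing, that gap drives the next targeted retrieval, and the loop stops once nothing is missing. The toy "LLM" and "retriever" below are dictionary lookups standing in for real model calls; all names and facts are invented for illustration.

```python
# Missing-information-guided retrieval loop, in miniature.
KNOWLEDGE = {
    "capital of France": "Paris",
    "river through Paris": "Seine",
}

def identify_missing(question, known):
    """Stand-in for the LLM noticing which sub-fact is still missing."""
    for need in ["capital of France", "river through Paris"]:
        if need not in known:
            return need
    return None

def retrieve(query):
    """Stand-in for the retriever (here: an exact lookup)."""
    return {query: KNOWLEDGE[query]}

def answer(question, known):
    """Stand-in for the solving step once all sub-facts are gathered."""
    return known.get("river through Paris")

question = "Which river flows through the capital of France?"
known = {}
while (gap := identify_missing(question, known)) is not None:
    known.update(retrieve(gap))  # the next query is steered by the gap

final = answer(question, known)
print(final)  # prints "Seine"
```

The paper's sentence-level re-ranking and filtering would sit between `retrieve` and `known.update`, discarding irrelevant passages before extraction.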
Kwesi
voted
yuexi
voted
#533
-
Wang 2024
LLM-Assisted Analytics in Semiconductor Test (Invited)
6th International Symposium on Machine Learning for CAD (MLCAD) 2024;(): Snowbird, UT Assoc Computing Machinery 2024 DOI: 10.1145/3670474.3685974 · Ref ID: 3222 The emergence of Large Language Models (LLMs) has impacted our perspective on applying Machine Learning (ML) in semiconductor test. This paper shares our experience in leveraging the power of LLMs to build an AI agent for test data analytics. We advocate for an end-to-end approach where the Knowledge Graph (KG) plays a central role. Using wafermap analytics as an example, we highlight the key ideas behind developing the LLM-assisted AI agent named IEA-Plot, and discuss its practical applications. |
mohammed afaan
voted
yuexi
voted
#454
-
Wang 2023
Knowledge Graph-Based Method for Intelligent Generation of Emergency Plans for Water Conservancy Projects
In response to the issues of poor content correlation and insufficient intelligent decision support in emergency plans for water conservancy projects, a method for intelligent generation of emergency plans based on knowledge graphs is proposed. Utilizing pre-trained language models (PTM) based on entity masking, the accuracy of entity recognition tasks is enhanced by uncovering contextual features surrounding the masked entities. By employing translations, rotations, and superpositions within the vector space, a multi-view convolutional neural network (MCNN) is constructed to enhance the accuracy of relation extraction through complementary and integrated feature representation. Integrating PTM with MCNN enables the construction of an emergency entity relationship extraction method based on PTM-MCNN. Neo4j is utilized for storing entity relationship triplets to construct an emergency knowledge graph. Through the utilization of the mutual information criterion, knowledge retrieval and matching are performed to accomplish the intelligent generation of emergency plans. The results indicate that PTM-MCNN achieves high recognition accuracy (F1 score of 92.2%), ensuring the reliability of the generated emergency plans. This work can effectively improve the intelligence of emergency management for water conservancy projects. |
mohammed afaan
voted
yuexi
voted
#772
-
Wang 2022
SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models
60th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2022;():4281-4294 Dublin, IRELAND Assoc Computational Linguistics-Acl 2022 Ref ID: 3405 Knowledge graph completion (KGC) aims to reason over known facts and infer the missing links. Text-based methods such as KG-BERT (Yao et al., 2019) learn entity representations from natural language descriptions, and have the potential for inductive KGC. However, the performance of text-based methods still largely lags behind graph embedding-based methods like TransE (Bordes et al., 2013) and RotatE (Sun et al., 2019b). In this paper, we identify that the key issue is efficient contrastive learning. To improve the learning efficiency, we introduce three types of negatives: in-batch negatives, pre-batch negatives, and self-negatives which act as a simple form of hard negatives. Combined with InfoNCE loss, our proposed model SimKGC can substantially outperform embedding-based methods on several benchmark datasets. In terms of mean reciprocal rank (MRR), we advance the state-of-the-art by +19% on WN18RR, +6.8% on the Wikidata5M transductive setting, and +22% on the Wikidata5M inductive setting. Thorough analyses are conducted to gain insights into each component. Our code is available at https://github.com/intfloat/SimKGC. |
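The InfoNCE-with-in-batch-negatives objective at the heart of contrastive KGC methods like the one above can be sketched with the standard library alone: each (head, relation) encoding should score its own tail higher than every other tail in the batch. The 2-d toy embeddings are invented; real models use PLM encoders and larger batches.

```python
# InfoNCE loss with in-batch negatives: the i-th key is the only
# positive for the i-th query; all other keys in the batch are negatives.
import math

def info_nce_loss(queries, keys, temperature=0.05):
    """Mean cross-entropy of the softmax over scaled dot-product scores."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += log_z - logits[i]  # -log softmax at the positive position
    return total / len(queries)

# Two (head, relation) encodings and their matching tail encodings.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys    = [[1.0, 0.0], [0.0, 1.0]]  # aligned: loss should be near zero
print(info_nce_loss(queries, keys))
```

Pre-batch negatives and self-negatives would simply append extra vectors to `keys` without adding new positives.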
mohammed afaan
voted
Ishan
voted
#1150
-
Wang 2023
Cross-Modal Knowledge Discovery, Inference, and Challenges
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;13759 LNCS():199-209 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-31414-8_6 · Ref ID: 5271 In recent years, multimodal knowledge has become a popular research topic in many fields, such as knowledge graphs and natural language processing. Multimodal knowledge involves multimodal knowledge graphs, multimodal pre-trained language models, multimodal knowledge inference, etc.; from online shopping to medical care, whether it is theoretical research or engineering application, the knowledge representation, discovery, and inference of multimodal knowledge have become the core technologies of the academic and industrial concern. This tutorial focuses on the state of the art of cross-modal knowledge discovery and inference and presents future research opportunities and challenges. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
mohammed afaan
voted
Ishan
voted
#1789
-
Wang 2023
Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2023;14302 LNAI():111-122 Springer Science and Business Media Deutschland GmbH 2023 DOI: 10.1007/978-3-031-44693-1_9 · Ref ID: 5141 Previous knowledge graph embedding approaches usually map entities to representations and utilize score functions to predict the target entities, yet they typically struggle to reason rare or emerging unseen entities. In this paper, we propose kNN-KGE, a new knowledge graph embedding approach with pre-trained language models, by linearly interpolating its entity distribution with k-nearest neighbors. We compute the nearest neighbors based on the distance in the entity embedding space from the knowledge store. Our approach can allow rare or emerging entities to be memorized explicitly rather than implicitly in model parameters. Experimental results demonstrate that our approach can improve inductive and transductive link prediction results and yield better performance for low-resource settings with only a few triples, which might be easier to reason via explicit memory (Code is available at: https://github.com/zjunlp/KNN-KG ). © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG. |
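The interpolation step described above, blending the model's entity distribution with a distribution built from nearest neighbors in embedding space, reduces to p = lam * p_knn + (1 - lam) * p_model. The sketch below uses toy 2-d embeddings and invented entity names; the real approach computes p_knn over a large knowledge store of PLM embeddings.

```python
# kNN-interpolated entity distribution, in miniature.
import math

def knn_distribution(query, store, temperature=1.0):
    """Softmax over negative distances to entries in the entity store."""
    dists = {e: math.dist(query, v) for e, v in store.items()}
    z = sum(math.exp(-d / temperature) for d in dists.values())
    return {e: math.exp(-d / temperature) / z for e, d in dists.items()}

def interpolate(p_model, p_knn, lam=0.5):
    """Linear interpolation of the two distributions over the entity set."""
    return {e: lam * p_knn.get(e, 0.0) + (1 - lam) * p_model.get(e, 0.0)
            for e in set(p_model) | set(p_knn)}

store = {"Paris": (0.0, 0.0), "Lyon": (3.0, 4.0)}  # entity embeddings
p_model = {"Paris": 0.4, "Lyon": 0.6}              # parametric prediction
p_knn = knn_distribution((0.1, 0.1), store)        # query lands near "Paris"
p = interpolate(p_model, p_knn, lam=0.5)
print(max(p, key=p.get))
```

Because p_knn is non-parametric, a rare entity that is well represented in the store can override the parametric prediction, which is exactly the memorization effect the entry describes.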
Kwesi
voted
Xinchen
voted
#1636
-
Wang 2023
A Medical Question Classification Approach Based on Prompt Tuning and Contrastive Learning
Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE 2023;2023-July():632-635 Knowledge Systems Institute Graduate School 2023 DOI: 10.18293/SEKE2023-025 · Ref ID: 5292 COVID-19 has profoundly impacted people's lives, and people are more concerned about medical and health issues, so it is essential to design an efficient method for classifying medical questions. Fine-tuning paradigms based on pre-trained language models have proven effective in recent years. However, PLMs based on fine-tuning paradigms lack robustness, and there is a gap between the pre-training phase and the downstream task form, resulting in PLMs that cannot use the rich latent knowledge in downstream tasks. We propose a medical question classification method that combines prompt fine-tuning and contrastive learning and uses the large-scale knowledge graph enhancement model ERNIE 3.0 as a feature extractor to address both problems. Our approach utilizes an additional prompt template to enable the PLM to unleash its potential in specific tasks and uses a contrastive sample strategy to alleviate the problem of confusable samples that are difficult to distinguish. Experiments on a medical question classification dataset show that the method achieves an accuracy of 93.65 percent, with better metrics than recent work. © 2023 Knowledge Systems Institute Graduate School. All rights reserved. |
Ishan
voted
Srividya
voted
#1389
-
Wang 2024
IDEATE: Detecting AI-Generated Text using Internal and External Factual Structures
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():8556-8568 European Language Resources Association (ELRA) 2024 Ref ID: 4590 The effective detection of AI-generated text is a vital principle to ensure responsible use of large language models (LLMs). Previous studies mainly focused on discovering and utilizing internal evidence contained in the text itself to perform the detection, while ignoring external evidence implicit in an established knowledge graph (KG), which may also be a key discriminative factor between AI-generated and human-written text. To address this deficiency, we propose IDEATE, a novel hierarchical graph network that utilizes both internal and external factual structures to detect AI-generated text. IDEATE consists of a mention-level subgraph at the bottom to describe internal factual structures of mentioned entities reflected in the input text, and an entity-level subgraph at the top to describe external factual structures of mentioned entities reflected in an external KG. Hierarchical graph convolution is then applied successively on the two subgraphs, through which the two types of factual structures will be embedded into the output and used for the final detection. Extensive experiments on four benchmarking datasets show that IDEATE consistently outperforms current state-of-the-art methods in detecting text generated by various LLMs, ranging from GPT-2 to the more powerful ChatGPT, verifying the necessity and superiority of introducing external evidence for AI-generated text detection. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
yuexi
voted
Mike
voted
#1634
-
Wang 2024
Medical Knowledge Graph Question-Answering System Based on Hybrid Dynamic Masking and Multi-strategy Fusion
J. Frontier. Comput. Sci. Technol. 2024;18(10):2770-2786 2024 DOI: 10.3778/j.issn.1673-9418.2401072 · Ref ID: 3854 Medical knowledge graph question-answering combines medical knowledge and natural language processing technology to provide accurate and fast question-answering services for medical practitioners and patients. However, the current Chinese medical knowledge graphs are not comprehensive enough due to the surge in data. Additionally, the complex and ambiguous nature of medical questions poses a significant challenge in accurately identifying entity information and generating answers that are both easily comprehensible and accessible to the public. This paper proposes a medical knowledge graph question-answering framework based on hybrid dynamic masking and multi-strategy fusion. Initially, a medical knowledge graph encompassing 34167 entities and 297463 relationships is constructed by integrating public datasets and disease knowledge from medical platforms, covering categories such as diseases, medications, and food. Subsequently, a BERT-MaskAttention-BiLSTM-CRF hybrid dynamic masking model is introduced to accurately identify medical entity information in the input, effectively focusing on essential content and eliminating interference from redundant information. Finally, entity alignment strategies are employed to unify and standardize medical entities, while intent recognition strategies delve into users' query intentions. This is coupled with the use of large language models to refine the output from the knowledge graph, ensuring that the responses are more readily comprehensible. Experimental results demonstrate that the model achieves a macro-average F1 score of 0.9602 in entity recognition comparative experiments and an average accuracy of 0.9656 in question-answering tests. The generated content is more easily comprehensible and interpretable. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved. |
Mike
voted
Srividya
voted
#692
-
Wang 2023
Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs
61st Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2023;():4706-4718 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3194 Logical reasoning over incomplete knowledge graphs to answer complex logical queries is a challenging task. With the emergence of new entities and relations in constantly evolving KGs, inductive logical reasoning over KGs has become a crucial problem. However, previous PLMs-based methods struggle to model the logical structures of complex queries, which limits their ability to generalize within the same structure. In this paper, we propose a structure-modeled textual encoding framework for inductive logical reasoning over KGs. It encodes linearized query structures and entities using pre-trained language models to find answers. For structure modeling of complex queries, we design stepwise instructions that implicitly prompt PLMs on the execution order of geometric operations in each query. We further separately model different geometric operations (i.e., projection, intersection, and union) on the representation space using a pre-trained encoder with additional attention and maxout layers to enhance structured modeling. We conduct experiments on two inductive logical reasoning datasets and three transductive datasets. The results demonstrate the effectiveness of our method on logical reasoning over KGs in both inductive and transductive settings. |
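The linearization-with-stepwise-instructions idea above can be sketched as turning a nested logical query into an ordered list of textual steps, one per geometric operation, that a text encoder can consume. The query encoding, wording of the steps, and the example query are all invented for illustration.

```python
# Linearize a nested logical query into numbered textual instructions.
def linearize(query):
    """query: nested ('op', args...) tuples -> newline-joined step lines."""
    steps = []

    def walk(node):
        op = node[0]
        if op == "entity":
            return node[1]
        if op == "project":
            src = walk(node[1])
            steps.append(f"step {len(steps) + 1}: from {src}, "
                         f"follow relation {node[2]}")
            return f"result of step {len(steps)}"
        if op == "intersect":
            parts = [walk(c) for c in node[1:]]
            steps.append(f"step {len(steps) + 1}: intersect "
                         + " and ".join(parts))
            return f"result of step {len(steps)}"

    walk(query)
    return "\n".join(steps)

# "Which universities did UK-based Turing Award winners attend?"
q = ("project",
     ("intersect",
      ("project", ("entity", "Turing Award"), "won_by"),
      ("project", ("entity", "UK"), "citizen_of~")),
     "attended")
print(linearize(q))
```

The recursion bottoms out at entities, so the emitted step order matches the execution order of the geometric operations, which is exactly what the stepwise instructions are meant to convey to the PLM.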
Srividya
voted
Ishan
voted
#3709
-
Wang 2024
MGSA: Multi-Granularity Graph Structure Attention for Knowledge Graph-to-Text Generation
arXiv 2024;(): 2024 Ref ID: 8602 The Knowledge Graph-to-Text Generation task aims to convert structured knowledge graphs into coherent and human-readable natural language text. Recent efforts in this field have focused on enhancing pre-trained language models (PLMs) by incorporating graph structure information to capture the intricate structure details of knowledge graphs. However, most of these approaches tend to capture only single-granularity structure information, concentrating either on the relationships between entities within the original graph or on the relationships between words within the same entity or across different entities. This narrow focus results in a significant limitation: models that concentrate solely on entity-level structure fail to capture the nuanced semantic relationships between words, while those that focus only on word-level structure overlook the broader relationships between the entire original entities. To overcome these limitations, this paper introduces the Multi-granularity Graph Structure Attention (MGSA), which is based on PLMs. The encoder of the model architecture features an entity-level structure encoding module, a word-level structure encoding module, and an aggregation module that synthesizes information from both structures. This multi-granularity structure encoding approach allows the model to simultaneously capture both entity-level and word-level structure information, providing a more comprehensive understanding of the knowledge graph's structure information, thereby significantly improving the quality of the generated text. We conducted extensive evaluations of the MGSA model using two widely recognized KG-to-Text Generation benchmark datasets, WebNLG and EventNarrative, where it consistently outperformed models that rely solely on single-granularity structure information, demonstrating the effectiveness of our approach. |
mohammed afaan
voted
Ishan
voted
#784
-
Wang 2023
SSKGE: a time-saving knowledge graph embedding framework based on structure enhancement and semantic guidance
In knowledge graph embedding, an attempt is made to embed the objective facts and relationships expressed in the form of triplets into multidimensional vector space, facilitating various applications, such as link prediction and question answering. Structure embedding models focus on the graph structure while ignoring the importance of language semantics in inferring similar entities and relations. Semantic embedding models use pretrained language models to learn entity and relation embeddings based on text information, but they do not fully exploit graph structures that reflect relation patterns and mapping attributes. Structure and semantic information in knowledge graphs represent different hierarchical properties that are indispensable for comprehensive knowledge representation. In this paper, we propose a general knowledge graph embedding framework named SSKGC, which considers both the graph structure and language semantics and learns these two complementary characteristics to integrate entity and relation representations. To compensate for semantic embedding approaches that ignore the graph structure, we first design a structure loss function to explicitly model the graph structure attributes. Second, we leverage a pretrained language model that has been fine-tuned by the structure loss to guide the structure embedding approaches in enhancing the semantic information they lack and obtaining universal knowledge representations. Specifically, guidance is provided by a distance function that makes the spatial distribution of the two types of graph embeddings have a certain similarity. SSKGE significantly reduces the time cost of using a pretrained language model to complete a knowledge graph. Common knowledge graph embedding models such as TransE, DistMult, ComplEx, RotatE, PairRE, and HousE have achieved better results with multiple datasets, including FB15k, FB15k-237, WN18, and WN18RR, using the SSKGE framework. Extensive experiments and analyses have verified the effectiveness and practicality of SSKGE. |
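The distance-based guidance described above, pulling each structure embedding toward its (frozen) semantic counterpart so the two spaces end up similarly distributed, can be sketched as gradient descent on a squared-L2 distance term. The 2-d toy embeddings, entity name, and learning rate are invented; the real framework combines this term with the structure loss during training.

```python
# One guidance update: move structure embeddings toward semantic ones.
def guidance_step(structure_emb, semantic_emb, lr=0.1):
    """Gradient-descent step on ||s - t||^2 per entity (grad = 2 * (s - t))."""
    return {
        e: [s - lr * 2.0 * (s - t)
            for s, t in zip(structure_emb[e], semantic_emb[e])]
        for e in structure_emb
    }

structure = {"aspirin": [0.0, 0.0]}   # structure-based embedding (trainable)
semantic  = {"aspirin": [1.0, 1.0]}   # from the fine-tuned language model
for _ in range(20):
    structure = guidance_step(structure, semantic)
print(structure["aspirin"])  # converges toward the semantic embedding
```

Each step scales the gap by 0.8, so after 20 steps the structure embedding sits within about 1% of the semantic target.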
Ishan
voted
Srividya
voted
#3358
-
Wang 2024
The Earth is Flat? Unveiling Factual Errors in Large Language Models
arXiv 2024;(): 2024 Ref ID: 8021 Large Language Models (LLMs) like ChatGPT are foundational in various applications due to their extensive knowledge from pre-training and fine-tuning. Despite this, they are prone to generating factual and commonsense errors, raising concerns that they may mislead users in critical areas like healthcare, journalism, and education. Current methods for evaluating LLMs' veracity are limited by test data leakage or the need for extensive human labor, hindering efficient and accurate error detection. To tackle this problem, we introduce a novel, automatic testing framework, FactChecker, aimed at uncovering factual inaccuracies in LLMs. This framework involves three main steps: First, it constructs a factual knowledge graph by retrieving fact triplets from a large-scale knowledge database. Then, leveraging the knowledge graph, FactChecker employs a rule-based approach to generate three types of questions (Yes-No, Multiple-Choice, and WH questions) that involve single-hop and multi-hop relations, along with correct answers. Lastly, it assesses the LLMs' responses for accuracy using tailored matching strategies for each question type. Our extensive tests on six prominent LLMs, including text-davinci-002, text-davinci-003, ChatGPT (gpt-3.5-turbo, gpt-4), Vicuna, and LLaMA-2, reveal that FactChecker can trigger factual errors in up to 45% of questions in these models. Moreover, we demonstrate that FactChecker's test cases can improve LLMs' factual accuracy through in-context learning and fine-tuning (e.g., llama-2-13b-chat's accuracy increases from 35.3% to 68.5%). We are making all code, data, and results available for future research endeavors. |
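The rule-based question-generation step above, turning a fact triplet into a Yes-No, a multiple-choice, and a WH question with its gold answer, can be sketched with simple templates. The templates, the example triplet, and the distractors are illustrative inventions, not the paper's actual rules.

```python
# Generate the three question types from one (head, relation, tail) triplet.
def questions_from_triplet(head, relation, tail, distractors):
    """Return a list of (question, gold_answer) pairs."""
    yes_no = (f"Is it true that {head} {relation} {tail}? (yes/no)", "yes")
    options = sorted([tail] + distractors)
    multiple_choice = (f"{head} {relation} which of the following? "
                       f"Options: {', '.join(options)}", tail)
    wh = (f"What does {head} {relation}?", tail)
    return [yes_no, multiple_choice, wh]

for question, gold in questions_from_triplet(
        "the Earth", "orbits", "the Sun", ["the Moon", "Mars"]):
    print(question, "->", gold)
```

Multi-hop variants would chain two triplets before filling the same templates, and the tailored matching strategies would compare each model answer against the stored gold answer.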
yuexi
voted
mohammed afaan
voted
#1556
-
Wang 2022
Language Models as Knowledge Embeddings
IJCAI International Joint Conference on Artificial Intelligence 2022;():2291-2297 International Joint Conferences on Artificial Intelligence 2022 Ref ID: 5443 Knowledge embeddings (KE) represent a knowledge graph (KG) by embedding entities and relations into continuous vector spaces. Existing methods are mainly structure-based or description-based. Structure-based methods learn representations that preserve the inherent structure of KGs. They cannot well represent abundant long-tail entities in real-world KGs with limited structural information. Description-based methods leverage textual information and language models. Prior approaches in this direction barely outperform structure-based ones, and suffer from problems like expensive negative sampling and restrictive description demand. In this paper, we propose LMKE, which adopts Language Models to derive Knowledge Embeddings, aiming at both enriching representations of long-tail entities and solving problems of prior description-based methods. We formulate description-based KE learning with a contrastive learning framework to improve efficiency in training and evaluation. Experimental results show that LMKE achieves state-of-the-art performance on KE benchmarks of link prediction and triple classification, especially for long-tail entities. © 2022 International Joint Conferences on Artificial Intelligence. All rights reserved. |
mohammed afaan
voted
Ishan
voted
#1919
-
Wang 2023
A Survey of Pre-Trained Language Models Incorporating Knowledge Graphs
2023 IEEE International Conference on Electrical, Automation and Computer Engineering, ICEACE 2023 2023;():1706-1710 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICEACE60673.2023.10442824 · Ref ID: 4986 Pre-trained models acquire knowledge from vast amounts of unannotated and unstructured data through self-supervised learning. However, they suffer from limitations such as inadequate performance and limited knowledge reasoning capabilities due to the lack of external knowledge guidance. To address these limitations, integrating structured knowledge from knowledge graphs into pretrained models enables them to acquire both general semantic knowledge from free text and real-world knowledge behind the text, thereby effectively addressing downstream knowledge-driven tasks. This paper introduces the concepts of pretrained models and knowledge graphs, discusses research advancements, provides an overview of methods for integrating knowledge into pretrained models, and proposes three classification approaches based on fusion methods. It also outlines the application domains where these approaches can be applied. Finally, the paper summarizes and discusses future research directions for pretrained models integrated with knowledge. © 2023 IEEE. |
Davis
voted
Srividya
voted
#1533
-
Wang 2023
Knowledge-enhanced Pre-Training large language model for depression diagnosis and treatment
Proceeding of 2023 9th IEEE International Conference on Cloud Computing and Intelligence Systems, CCIS 2023 2023;():532-536 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/CCIS59572.2023.10263217 · Ref ID: 5188 Depression, a pervasive psychiatric disorder characterized by concealment, dependence on expert judgment, and a notable rate of misdiagnosis, poses a substantial burden on society. To enhance the diagnosis and treatment of depression, this study puts forth a proposition of employing knowledge-enhanced pre-training technology leveraging large language models. By integrating domain knowledge and depression knowledge graph directives, the pre-trained model undergoes optimization. Expert involvement in depression diagnosis and treatment fosters a guided learning process facilitated by expert feedback. Through the application of dialogue therapy, the efficacy of treatment is augmented. This technical approach aims to ameliorate the societal burden by improving the diagnosis and treatment of depressed individuals. © 2023 IEEE. |
Voted: Kwesi, Xinchen · Final decision: (not recorded)
#3148
-
Wang 2024
Medical knowledge graph completion via fusion of entity description and type information
|
Voted: Ishan, brandon · Final decision: (not recorded)
#3175
-
Wang 2024
AceMap: Knowledge Discovery through Academic Graph
arXiv 2024;(): 2024 Ref ID: 8160 The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publications. The representation of heterogeneous graphs and the effective measurement, analysis, and mining of such graphs pose significant challenges. To address these challenges, we present AceMap, an academic system designed for knowledge discovery through academic graph. We present advanced database construction techniques to build the comprehensive AceMap database with large-scale academic entities that contain rich visual, textual, and numerical information. AceMap also employs innovative visualization, quantification, and analysis methods to explore associations and logical relationships among academic entities. AceMap introduces large-scale academic network visualization techniques centered on nebular graphs, providing a comprehensive view of academic networks from multiple perspectives. In addition, AceMap proposes a unified metric based on structural entropy to quantitatively measure the knowledge content of different academic entities. Moreover, AceMap provides advanced analysis capabilities, including tracing the evolution of academic ideas through citation relationships and concept co-occurrence, and generating concise summaries informed by this evolutionary process. In addition, AceMap uses machine reading methods to generate potential new ideas at the intersection of different fields. Exploring the integration of large language models and knowledge graphs is a promising direction for future research in idea evolution. Please visit \url{https://www.acemap.info} for further exploration. |
Voted: mohammed afaan, yuexi · Final decision: (not recorded)
#3858
-
Wang 2024
SciDaSynth: Interactive Structured Knowledge Extraction and Synthesis from Scientific Literature with Large Language Model
arXiv 2024;(): 2024 Ref ID: 8243 Extraction and synthesis of structured knowledge from extensive scientific literature are crucial for advancing and disseminating scientific progress. Although many existing systems facilitate literature review and digest, they struggle to process multimodal, varied, and inconsistent information within and across the literature into structured data. We introduce SciDaSynth, a novel interactive system powered by large language models (LLMs) that enables researchers to efficiently build structured knowledge bases from scientific literature at scale. The system automatically creates data tables to organize and summarize users' interested knowledge in literature via question-answering. Furthermore, it provides multi-level and multi-faceted exploration of the generated data tables, facilitating iterative validation, correction, and refinement. Our within-subjects study with researchers demonstrates the effectiveness and efficiency of SciDaSynth in constructing quality scientific knowledge bases. We further discuss the design implications for human-AI interaction tools for data extraction and structuring. |
Voted: Xinchen, Srividya · Final decision: (not recorded)
#3640
-
Wang 2024
LCMDC: Large-scale Chinese Medical Dialogue Corpora for Automatic Triage and Medical Consultation
arXiv 2024;(): 2024 Ref ID: 8660 The global COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services, especially in medical triage and consultation. However, existing studies face two main challenges. First, the scarcity of large-scale, publicly available, domain-specific medical datasets due to privacy concerns, with current datasets being small and limited to a few diseases, limiting the effectiveness of triage methods based on Pre-trained Language Models (PLMs). Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations. To overcome these obstacles, we construct the Large-scale Chinese Medical Dialogue Corpora (LCMDC), comprising a Coarse-grained Triage dataset with 439,630 samples, a Fine-grained Diagnosis dataset with 199,600 samples, and a Medical Consultation dataset with 472,418 items, thereby addressing the data shortage in this field. Moreover, we further propose a novel triage system that combines BERT-based supervised learning with prompt learning, as well as a GPT-based medical consultation model using reinforcement learning. To enhance domain knowledge acquisition, we pre-trained PLMs using our self-constructed background corpus. Experimental results on the LCMDC demonstrate the efficacy of our proposed systems. |
Voted: mohammed afaan, yuexi · Final decision: (not recorded)
#569
-
Wang 2022
Military Chain: Construction of Domain Knowledge Graph of Kill Chain Based on Natural Language Model
With the advent of the Big Data era, the specialized data in the kill chain domain has increased dramatically, and the engine-based method of retrieving information can hardly meet the users' need for more accurate answers. The kill chain domain includes four components: control equipment, sensor equipment, strike equipment (weapon and platform), and evaluator equipment, as well as related data which contain a large amount of valuable information such as the parameter information contained in each component. If these fragmented and confusing data are integrated and effective query methods are established, they can help professionals complete the military kill chain knowledge system. The knowledge system constructed in this paper is based on the Neo4j graph database and the US Command simulation system to establish a target-oriented knowledge map of the kill chain, aiming to provide data support for the Q&A system. Secondly, in order to facilitate the query, this paper establishes entity and relationship/attribute mining based on the continuous bag-of-words (CBOW) encoding model, bidirectional long short-term memory-conditional random field (BiLSTM-CRF) named entity model, and bidirectional gated recurrent neural network (BiGRU) intent recognition model for Chinese kill chain question and answer; returns the corresponding entity or attribute values in combination with the knowledge graph triple form; and finally constructs the returned answer. The constructed knowledge map of the kill chain contains 2767 items (including sea, land, and air), and the number of parameters involved is 30124. The number of model parameters of the deep learning network is 27.9 M for the Q&A system built this time, and the accuracy rate is 85.5% after 200 simulated queries. |
Voted: Mike, Srividya · Final decision: (not recorded)
#1464
-
Wang 2024
KC-GenRe: A Knowledge-constrained Generative Re-ranking Method Based on Large Language Models for Knowledge Graph Completion
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9668-9680 European Language Resources Association (ELRA) 2024 Ref ID: 4540 The goal of knowledge graph completion (KGC) is to predict missing facts among entities. Previous methods for KGC re-ranking are mostly built on non-generative language models to obtain the probability of each candidate. Recently, generative large language models (LLMs) have shown outstanding performance on several tasks such as information extraction and dialog systems. Leveraging them for KGC re-ranking is beneficial for exploiting the extensive pre-trained knowledge and powerful generative capabilities. However, it may encounter new problems when accomplishing the task, namely mismatch, misordering and omission. To this end, we introduce KC-GenRe, a knowledge-constrained generative re-ranking method based on LLMs for KGC. To overcome the mismatch issue, we formulate the KGC re-ranking task as a candidate identifier sorting generation problem implemented by generative LLMs. To tackle the misordering issue, we develop a knowledge-guided interactive training method that enhances the identification and ranking of candidates. To address the omission issue, we design a knowledge-augmented constrained inference method that enables contextual prompting and controlled generation, so as to obtain valid rankings. Experimental results show that KC-GenRe achieves state-of-the-art performance on four datasets, with gains of up to 6.7% and 7.7% in the MRR and Hits@1 metric compared to previous methods, and 9.0% and 11.1% compared to that without re-ranking. Extensive analysis demonstrates the effectiveness of components in KC-GenRe. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Voted: mohammed afaan, yuexi · Final decision: (not recorded)
#193
-
Wang 2023
Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
61st Annual Meeting of the the Association-for-Computational-Linguistics (ACL) 2023;():14048-14063 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3248 Recently, knowledge graphs (KGs) have won noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning by LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using LM, and dynamically removes irrelevant KG entities based on the attention weights of LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and perform answer reasoning on the HKG by RMSA. We evaluate DHLK at CommonsenseQA and OpenBookQA, and show its improvement on existing LM and LM+KG methods. |
Voted: Mike, Srividya · Final decision: (not recorded)
#1964
-
Wang 2023
Towards Alleviating the Object Bias in Prompt Tuning-based Factual Knowledge Extraction
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():4420-4432 Association for Computational Linguistics (ACL) 2023 Ref ID: 5178 Many works employed prompt tuning methods to automatically optimize prompt queries and extract the factual knowledge stored in Pretrained Language Models. In this paper, we observe that the optimized prompts, including discrete prompts and continuous prompts, exhibit undesirable object bias. To handle this problem, we propose a novel prompt tuning method called MeCoD consisting of three modules: Prompt Encoder, Object Equalization and Biased Object Obstruction. Experimental results show that MeCoD can significantly reduce the object bias and at the same time improve accuracy of factual knowledge extraction. © 2023 Association for Computational Linguistics. |
Voted: yuexi, Mike · Final decision: (not recorded)
#676
-
Wang 2023
Prompt-based Zero-shot Text Classification with Conceptual Knowledge
61st Annual Meeting of the Association-for-Computational-Linguistics / Student Research Workshop (ACL-SRW) 2023;():30-38 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3451 In recent years, pre-trained language models have garnered significant attention due to their effectiveness, which stems from the rich knowledge acquired during pre-training. To mitigate the inconsistency issues between pre-training tasks and downstream tasks and to facilitate the resolution of language-related issues, prompt-based approaches have been introduced, which are particularly useful in low-resource scenarios. However, existing approaches mostly rely on verbalizers to translate the predicted vocabulary to task-specific labels. The major limitations of this approach are the ignorance of potentially relevant domain-specific words and being biased by the pre-training data. To address these limitations, we propose a framework that incorporates conceptual knowledge for text classification in the extreme zero-shot setting. The framework includes prompt-based keyword extraction, weight assignment to each prompt keyword, and final representation estimation in the knowledge graph embedding space. We evaluated the method on four widely-used datasets for sentiment analysis and topic detection, demonstrating that it consistently outperforms recently-developed prompt-based approaches in the same experimental settings. |
Voted: Ishan, brandon · Final decision: (not recorded)
#2046
-
Wang 2024
Zero-Shot Medical Information Retrieval via Knowledge Graph Embedding
Communications in Computer and Information Science 2024;2019 CCIS():29-40 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-3-031-52216-1_3 · Ref ID: 4731 In the era of the Internet of Things (IoT), the retrieval of relevant medical information has become essential for efficient clinical decision-making. This paper introduces MedFusionRank, a novel approach to zero-shot medical information retrieval (MIR) that combines the strengths of pre-trained language models and statistical methods while addressing their limitations. The proposed approach leverages a pre-trained BERT-style model to extract compact yet informative keywords. These keywords are then enriched with domain knowledge by linking them to conceptual entities within a medical knowledge graph. Experimental evaluations on medical datasets demonstrate MedFusionRank’s superior performance over existing methods, with promising results with a variety of evaluation metrics. MedFusionRank demonstrates efficacy in retrieving relevant information, even from short or single-term queries. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024. |
Voted: Ishan, brandon · Final decision: (not recorded)
#1528
-
Wang 2024
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
Proceedings of Machine Learning Research 2024;235():52260-52273 ML Research Press 2024 Ref ID: 4311 Directed evolution, a cornerstone of protein optimization, is to harness natural mutational processes to enhance protein functionality. Existing Machine Learning-assisted Directed Evolution (MLDE) methodologies typically rely on data-driven strategies and often overlook the profound domain knowledge in biochemical fields. In this paper, we introduce a novel Knowledge-aware Reinforced Language Model (KnowRLM) for MLDE. An Amino Acid Knowledge Graph (AAKG) is constructed to represent the intricate biochemical relationships among amino acids. We further propose a Protein Language Model (PLM)-based policy network that iteratively samples mutants through preferential random walks on the AAKG using a dynamic sliding window mechanism. The novel mutants are actively sampled to fine-tune a fitness predictor as the reward model, providing feedback to the knowledge-aware policy. Finally, we optimize the whole system in an active learning approach that mimics biological settings in practice. KnowRLM stands out for its ability to utilize contextual amino acid information from knowledge graphs, thus attaining advantages from both statistical patterns of protein sequences and biochemical properties of amino acids. Extensive experiments demonstrate the superior performance of KnowRLM in more efficiently identifying high-fitness mutants compared to existing methods. Copyright 2024 by the author(s) |
Voted: Davis, mohammed afaan · Final decision: (not recorded)
#3958
-
Wang 2024
Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
arXiv 2024;(): 2024 Ref ID: 8513 In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs' internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt for alternative, shortcut-like pathways to answer reasoning questions. By manually manipulating the recall process of parametric knowledge in LLMs, we demonstrate that enhancing this recall process directly improves reasoning performance whereas suppressing it leads to notable degradation. Furthermore, we assess the effect of Chain-of-Thought (CoT) prompting, a powerful technique for addressing complex reasoning tasks. Our findings indicate that CoT can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning. Furthermore, we explored how contextual conflicts affect the retrieval of facts during the reasoning process to gain a comprehensive understanding of the factual recall behaviors of LLMs. Code and data will be available soon. |
Voted: Mike, yuexi · Final decision: (not recorded)
#3783
-
Wang 2024
Prometheus Chatbot: Knowledge Graph Collaborative Large Language Model for Computer Components Recommendation
arXiv 2024;(): 2024 Ref ID: 8491 Knowledge graphs (KGs) are essential in applications such as network alignment, question-answering, and recommender systems (RSs) since they offer structured relational data that facilitate the inference of indirect relationships. However, the development of KG-based RSs capable of processing user inputs in natural language faces significant challenges. Firstly, natural language processing units must effectively handle the ambiguity and variability in human language to interpret user intents accurately. Secondly, the system must precisely identify and link entities, like product names, to their corresponding nodes in KGs. To overcome these challenges, supported by Lenovo, we developed a novel chatbot called "Prometheus," which integrates a KG with a large language model (LLM), specifically designed for recommending computer components. This chatbot can accurately decode user requests and deliver personalized recommendations derived from KGs, ensuring precise comprehension and response to their computer setup needs. |
Voted: Srividya, Ishan · Final decision: (not recorded)
#984
-
Wang 2023
AMD Results for OAEI 2023
CEUR Workshop Proceedings 2023;3591():146-153 CEUR-WS 2023 Ref ID: 5044 AgreementMakerDeep (AMD) is a new flexible and extensible ontology matching system. It exploits the contextual and structural information of ontologies by infusing knowledge into a pre-trained masked language model, and then filters the output mappings using knowledge graph embedding techniques. AMD learns from classes and the relations between classes by constructing vector representations in a low-dimensional embedding space with knowledge graph embedding methods. The results demonstrate that AMD achieves competitive performance in many OAEI tracks, but AMD has limitations for property and instance matching. © 2023 Copyright for this paper by its authors. |
Voted: Xinchen, Srividya · Final decision: (not recorded)
#2293
-
Wang 2023
CRule: Category-Aware Symbolic Multihop Reasoning on Knowledge Graphs
Multihop reasoning is essential in knowledge graph (KG) research and applications. Current methods rely on specific KG entities, while human cognition operates at a more abstract level. This article proposes a category-aware rule-based (CRule) approach for symbolic multihop reasoning. Specifically, given a KG, CRule first categorizes entities and constructs a category-aware KG; it then uses rules retrieved from the categorized KG to perform multihop reasoning on the original KG. Experiments on five datasets show that CRule is simple, is effective, and combines the advantages of symbolic and neural network methods. It overcomes symbolic reasoning’s complexity limitations, can perform reasoning on KGs of more than 300,000 edges, and can be three times more efficient than neural network models. |
Voted: mohammed afaan, yuexi · Final decision: (not recorded)
#1222
-
Wang 2024
ECoK: Emotional Commonsense Knowledge Graph for Mining Emotional Gold
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():8055-8074 Association for Computational Linguistics (ACL) 2024 Ref ID: 4237 The demand for understanding and expressing emotions in the field of natural language processing is growing rapidly. Knowledge graphs, as an important form of knowledge representation, have been widely utilized in various emotion-related tasks. However, existing knowledge graphs mainly focus on the representation and reasoning of general factual knowledge, while there are still significant deficiencies in the understanding and reasoning of emotional knowledge. In this work, we construct a comprehensive and accurate emotional commonsense knowledge graph, ECoK. We integrate cutting-edge theories from multiple disciplines such as psychology, cognitive science, and linguistics, and combine techniques such as large language models and natural language processing. By mining a large amount of text, dialogue, and sentiment analysis data, we construct rich emotional knowledge and establish the knowledge generation model COMET-ECoK. Experimental results show that ECoK contains high-quality emotional reasoning triples, and the performance of our knowledge generation model surpasses GPT-4-Turbo, which can help downstream tasks better understand and reason about emotions. Our data and code is available from https://github.com/ZornWang/ECoK. © 2024 Association for Computational Linguistics. |
Voted: Mike, Xinchen · Final decision: (not recorded)
#1839
-
Wang 2024
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():14743-14777 Association for Computational Linguistics (ACL) 2024 Ref ID: 4225 The advent of Large Language Models (LLMs) has paved the way for complex tasks such as role-playing, which enhances user interactions by enabling models to imitate various characters. However, the closed-source nature of state-of-the-art LLMs and their general-purpose training limit role-playing optimization. In this paper, we introduce RoleLLM, a framework to benchmark, elicit, and enhance role-playing abilities in LLMs. RoleLLM comprises four stages: (1) Role Profile Construction for 100 roles; (2) Context-Based Instruction Generation (Context-Instruct) for role-specific knowledge extraction; (3) Role Prompting using GPT (RoleGPT) for speaking style imitation; and (4) Role-Conditioned Instruction Tuning (RoCIT) for fine-tuning open-source models along with role customization. By Context-Instruct and RoleGPT, we create RoleBench, the first systematic and fine-grained character-level benchmark dataset for role-playing with 168,093 samples. Moreover, RoCIT on RoleBench yields RoleLLaMA (English) and RoleGLM (Chinese), significantly enhancing role-playing abilities and even achieving comparable results with RoleGPT (using GPT-4). © 2024 Association for Computational Linguistics. |
Voted: mohammed afaan, Ishan · Final decision: (not recorded)
#1245
-
Wang 2024
Enhance Large Language Models for Multilingual Sentence Embedding with Knowledge Graph
Proceedings of the International Joint Conference on Neural Networks 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/IJCNN60899.2024.10650221 · Ref ID: 4282 Sentence representation is a major challenge in natural language processing, especially in multilingual environments. Current approaches to sentence representation using Large Language Models (LLMs) often require large amounts of data for fine-tuning, and research has focused on English content. In addition, comparative datasets translated directly from English can contain many semantic and syntactic errors. To address these issues, we propose a new approach to enhance multilingual sentence embeddings using LLMs and knowledge graphs. We first present a dedicated designed prompt that exploits in-context learning of LLMs for sentence embedding without fine-tuning. We further introduce an innovative method that utilizes knowledge graphs, such as Wikidata, for generating diverse multilingual training data for contrastive finetuning. This approach significantly reduces the reliance on translated sentences and mitigates issues related to translation accuracy. Furthermore, we develop a unique multilingual contrastive learning loss function, which, when combined with QLora's efficient fine-tuning technique, enables LLMs to achieve state-of-the-art performance in Sentence Text Similarity (STS) tasks, even with limited computational resources. © 2024 IEEE. |
Voted: Mike, Srividya · Final decision: (not recorded)
#1382
-
Wasi 2024
HRGraph: Leveraging LLMs for HR Data Knowledge Graphs with Information Propagation-based Job Recommendation
KaLLM 2024 - 1st Workshop on Knowledge Graphs and Large Language Models, Proceedings of the Workshop 2024;():56-62 Association for Computational Linguistics (ACL) 2024 Ref ID: 4372 Knowledge Graphs (KGs), serving as semantic networks, prove highly effective in managing complex interconnected data in different domains by offering a unified, contextualized, and structured representation with flexibility that allows for easy adaptation to evolving knowledge. Processing complex Human Resources (HR) data, KGs can help in different HR functions like recruitment, job matching, identifying learning gaps, and enhancing employee retention. Despite their potential, limited efforts have been made to implement practical HR knowledge graphs. This study addresses this gap by presenting a framework for effectively developing HR knowledge graphs from documents using Large Language Models. The resulting KG can be used for a variety of downstream tasks, including job matching, identifying employee skill gaps, and many more. In this work, we showcase instances where HR KGs prove instrumental in precise job matching, yielding advantages for both employers and employees. Empirical evidence from experiments with information propagation in KGs and Graph Neural Nets, along with case studies, underscores the effectiveness of KGs in tasks such as job and employee recommendations and job area classification. ©2024 Association for Computational Linguistics. |
Voted: Srividya, Ishan · Final decision: (not recorded)
#1481
-
Wei 2023
KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():8667-8683 Association for Computational Linguistics (ACL) 2023 Ref ID: 5103 Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC. They can be categorized into two main classes: triple-based and text-based approaches. Triple-based methods struggle with long-tail entities due to limited structural information and imbalanced entity distributions. Text-based methods alleviate this issue but require costly training for language models and specific finetuning for knowledge graphs, which limits their efficiency. To alleviate these limitations, in this paper, we propose KICGPT, a framework that integrates a large language model (LLM) and a triple-based KGC retriever. It alleviates the long-tail problem without incurring additional training overhead. KICGPT uses an in-context learning strategy called Knowledge Prompt, which encodes structural knowledge into demonstrations to guide the LLM. Empirical results on benchmark datasets demonstrate the effectiveness of KICGPT with smaller training overhead and no finetuning. © 2023 Association for Computational Linguistics. |
Voted: Ishan, Srividya · Final decision: (not recorded)
#349
-
Wei 2023
Improving Bug Severity Prediction With Domain-Specific Representation Learning
Automating the process of bug severity assignment can accelerate bug triagers' efficiency in the life-cycle of software maintenance, improving the quality of software products. The mainstream approaches for bug severity prediction mainly use different neural networks due to their automated learning ability. However, there are two problems that make existing approaches fail to predict severities for some bugs: 1) they cannot learn the internal knowledge of bug reports; 2) supervised training is difficult to understand the global context of bug reports. To resolve these two problems, in this paper, we propose a bug severity prediction approach, namely KICL, which combines pre-trained language models and domain-specific pre-training strategies, i.e., Knowledge-Intensified pre-training and contrastive learning pre-training. Specifically, Knowledge-Intensified allows KICL to learn project-specific bug report tokens, deeply understanding internal knowledge of bug reports. As for contrastive learning, it allows KICL to perform sequence-level learning, understanding bug reports from the perspective of the global context. When finishing pre-training, we can fine-tune pre-trained KICL for bug severity prediction. To evaluate the effectiveness of KICL, we choose six baseline approaches and compare their performance on a public dataset. The experimental results show that KICL outperforms all baseline approaches by up to 30.68% in terms of weighted average F1-score, achieving new results for bug severity prediction. |
Voted: mohammed afaan, Ishan · Final decision: (not recorded)
#3347
-
Wei 2024
Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models
arXiv 2024;(): 2024 Ref ID: 8573 Large language models encapsulate knowledge and have demonstrated superior performance on various natural language processing tasks. Recent studies have localized this knowledge to specific model parameters, such as the MLP weights in intermediate layers. This study investigates the differences between entity and relational knowledge through knowledge editing. Our findings reveal that entity and relational knowledge cannot be directly transferred or mapped to each other. This result is unexpected, as logically, modifying the entity or the relation within the same knowledge triplet should yield equivalent outcomes. To further elucidate the differences between entity and relational knowledge, we employ causal analysis to investigate how relational knowledge is stored in pre-trained models. Contrary to prior research suggesting that knowledge is stored in MLP weights, our experiments demonstrate that relational knowledge is also significantly encoded in attention modules. This insight highlights the multifaceted nature of knowledge storage in language models, underscoring the complexity of manipulating specific types of knowledge within these models. |
Votes: Mike, Srividya · Final decision: not recorded
#2097
-
Weihong 2004
Adding context-awareness to knowledge management in modern enterprises
2004 2nd International IEEE Conference on 'Intelligent Systems'. Proceedings (IEEE Cat. No.04EX791) 2004;2():393-398 Vol.2 2004 DOI: 10.1109/IS.2004.1344779 · Ref ID: 6072 To reduce the negative impact of knowledge loss and to improve knowledge reuse effectiveness in knowledge management in modern enterprises, this work presents a context-aware approach to facilitate managing various types of static enterprise information and dynamic process information. The proposed approach features representing and integrating information at different conceptual levels to present contextual knowledge in an open environment. In this paper, we redefine the concept of context in intelligent systems and propose a set of meta-information elements for context description in business environments. In realising the context-awareness in knowledge management, we present a context knowledge structure model and look into the corresponding context knowledge storage and reuse solutions. To enhance context-aware knowledge management for e-businesses over the global network, we introduce a new concept of context knowledge grid with a layered knowledge interoperation reference model, which are intended to leverage the contextual knowledge in modern enterprises and enable interoperation with other knowledge frameworks such as the semantic Web and the semantic grid. |
Votes: Mike, mohammed afaan · Final decision: not recorded
#1254
-
Wen 2024
Enhancing Fault Troubleshooting through Human-Machine Collaboration: A Multi-Stage Reasoning Approach
IEEE International Conference on Automation Science and Engineering 2024;():460-467 IEEE Computer Society 2024 DOI: 10.1109/CASE59546.2024.10711734 · Ref ID: 4112 Ensuring the stable operation of critical industrial equipment is pivotal for maintaining production efficiency and economic gains. The complexity of modern industrial machinery, however, places a substantial cognitive load on maintenance personnel. To alleviate this, a Diagnostic Semantic-Enhanced Fault Causality Knowledge Graph (DSFCKG) is proposed to formalize fault information for computational analysis. Additionally, a Large Language Model (LLM)-based Knowledge Graph Construction (KGC) method is introduced for the automated assembly of DSFCKG. Building upon this, a multi-stage reasoning approach is designed for human-machine collaborative fault troubleshooting. Experiments on real-world fault tickets demonstrate that our proposed method significantly enhances fault diagnosis and troubleshooting accuracy, especially in complex scenarios with long fault causal chains, which brings insights into future smart maintenance. © 2024 IEEE. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#1670
-
Wen 2021
Named Entity Recognition for Instructions of Chinese Medicine Based on Pre-trained Language Model
Proceedings - 2021 3rd International Conference on Natural Language Processing, ICNLP 2021 2021;():139-144 Institute of Electrical and Electronics Engineers Inc. 2021 DOI: 10.1109/ICNLP52887.2021.00029 · Ref ID: 5552 Named Entity Recognition (NER) of Chinese medicine text is a basic task of constructing medical and health knowledge graphs. Many scholars have researched the NER task of electronic medical records and drug names, while many factors restrict the research of NER tasks for the instructions of Chinese medicine. For example, there is no obvious boundary between words in Chinese, and it is impossible to capture the interactive information between sentences and the global information at the same time. Considering that this type of data is highly specialized and there is no publicly available data set, this paper collected 1,000 pieces of instructions of Chinese medicine, then explored the effectiveness of pre-trained models in the NER task in this field. The experimental results showed that compared with the experimental results of the single or joint model on the same data set, the F1 value of the pre-trained model was increased by 9.65% and 8.71% respectively. © 2021 IEEE. |
Votes: Kwesi, Xinchen · Final decision: not recorded
#2969
-
Weng 2012
Symbolic Models and Emergent Models: A Review
IEEE Transactions on Autonomous Mental Development 2012;4(1):29-53 2012 DOI: 10.1109/TAMD.2011.2159113 · Ref ID: 6096 There exists a large conceptual gap between symbolic models and emergent models for the mind. Many emergent models work on low-level sensory data, while many symbolic models deal with high-level abstract (i.e., action) symbols. There has been relatively little study on intermediate representations, mainly because of a lack of knowledge about how representations fully autonomously emerge inside the closed brain skull, using information from the exposed two ends (the sensory end and the motor end). As reviewed here, this situation is changing. A fundamental challenge for emergent models is abstraction, which symbolic models enjoy through human handcrafting. The term abstract refers to properties disassociated with any particular form. Emergent abstraction seems possible, although the brain appears to never receive a computer symbol (e.g., ASCII code) or produce such a symbol. This paper reviews major agent models with an emphasis on representation. It suggests two different ways to relate symbolic representations with emergent representations: One is based on their categorical definitions. The other considers that a symbolic representation corresponds to a brain's outside behaviors observed and handcrafted by other outside human observers; but an emergent representation is inside the brain. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#1588
-
White 2023
Leveraging Explicit Procedural Instructions for Data-Efficient Action Prediction
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():2895-2904 Association for Computational Linguistics (ACL) 2023 Ref ID: 5184 Task-oriented dialogues often require agents to enact complex, multi-step procedures in order to meet user requests. While large language models have found success automating these dialogues in constrained environments, their widespread deployment is limited by the substantial quantities of task-specific data required for training. The following paper presents a data-efficient solution to constructing dialogue systems, leveraging explicit instructions derived from agent guidelines, such as company policies or customer service manuals. Our proposed Knowledge-Augmented Dialogue System (KADS) combines a large language model with a knowledge retrieval module that pulls documents outlining relevant procedures from a predefined set of policies, given a user-agent interaction. To train this system, we introduce a semi-supervised pre-training scheme that employs dialogue-document matching and action-oriented masked language modeling with partial parameter freezing. We evaluate the effectiveness of our approach on prominent task-oriented dialogue datasets, Action-Based Conversations Dataset and Schema-Guided Dialogue, for two dialogue tasks: action state tracking and workflow discovery. Our results demonstrate that procedural knowledge augmentation improves accuracy predicting in- and out-of-distribution actions while preserving high performance in settings with low or sparse data. © 2023 Association for Computational Linguistics. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#1162
-
Winter 2024
DDxGym: Online Transformer Policies in a Knowledge Graph Based Natural Language Environment
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():4438-4448 European Language Resources Association (ELRA) 2024 Ref ID: 4573 Differential diagnosis (DDx) is vital for physicians and challenging due to the existence of numerous diseases and their complex symptoms. Model training for this task is generally hindered by limited data access due to privacy concerns. To address this, we present DDxGym, a specialized OpenAI Gym environment for clinical differential diagnosis. DDxGym formulates DDx as a natural-language-based reinforcement learning (RL) problem, where agents emulate medical professionals, selecting examinations and treatments for patients with randomly sampled diseases. This RL environment utilizes data labeled from online resources, evaluated by medical professionals for accuracy. Transformers, while effective for encoding text in DDxGym, are unstable in online RL. For that reason we propose a novel training method using an auxiliary masked language modeling objective for policy optimization, resulting in model stabilization and significant performance improvement over strong baselines. Following this approach, our agent effectively navigates large action spaces and identifies universally applicable actions. All data, environment details, and implementation, including experiment reproduction code, are made publicly available. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
Votes: mohammed afaan, Ishan · Final decision: not recorded
#263
-
Wu 2024
Exploring the reversal curse and other deductive logical reasoning in BERT and GPT-based large language models
The "Reversal Curse" describes the inability of autoregressive decoder large language models (LLMs) to deduce "B is A" from "A is B," assuming that B and A are distinct and can be uniquely identified from each other. This logical failure suggests limitations in using generative pretrained transformer (GPT) models for tasks like constructing knowledge graphs. Our study revealed that a bidirectional LLM, bidirectional encoder representations from transformers (BERT), does not suffer from this issue. To investigate further, we focused on more complex deductive reasoning by training encoder and decoder LLMs to perform union and intersection operations on sets. While both types of models managed tasks involving two sets, they struggled with operations involving three sets. Our findings underscore the differences between encoder and decoder models in handling logical reasoning. Thus, selecting BERT or GPT should depend on the task's specific needs, utilizing BERT's bidirectional context comprehension or GPT's sequence prediction strengths. |
Votes: Srividya, Ishan · Final decision: not recorded
#1059
-
Wu 2023
Chain of Thought Prompting Elicits Knowledge Augmentation
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():6519-6534 Association for Computational Linguistics (ACL) 2023 Ref ID: 5220 The knowledge-augmented deep learning paradigm refers to a paradigm in which domain knowledge is identified and integrated into deep models. Conventional methods typically employ task-specific approaches to gather external knowledge from various sources. In contrast, large language models are extensively pre-trained and can serve as a comprehensive source of external knowledge. In this paper, we propose CoT-KA, a Chain-of-Thought-based method that augments knowledge for deep learning. CoT-KA avoids the need for additional knowledge retrieval or knowledge reasoning models, as required in conventional augmentation methods. Our results demonstrate that CoT-KA outperforms both pure CoT-based methods and the non-augmented method across the majority of eleven publicly available benchmarks for various reasoning tasks. © 2023 Association for Computational Linguistics. |
Votes: Xinchen, mohammed afaan · Final decision: not recorded
#1119
-
Wu 2023
CONIC10K: A Challenging Math Problem Understanding and Reasoning Dataset
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():6444-6458 Association for Computational Linguistics (ACL) 2023 Ref ID: 5084 Mathematical understanding and reasoning are crucial tasks for assessing the capabilities of artificial intelligence (AI). However, existing benchmarks either require just a few steps of reasoning, or only contain a small amount of data in one specific topic, making it hard to analyse AI's behaviour with reference to different problems within a specific topic in detail. In this work, we propose CONIC10K, a challenging math problem dataset on conic sections in Chinese senior high school education. Our dataset contains various problems with different reasoning depths, while only the knowledge from conic sections is required. Since the dataset only involves a narrow range of knowledge, it is easy to separately analyse the knowledge a model possesses and the reasoning ability it has. For each problem, we provide a high-quality formal representation, the reasoning steps, and the final solution. Experiments show that existing large language models, including GPT-4, exhibit weak performance on complex reasoning. We hope that our findings could inspire more advanced techniques for precise natural language understanding and reasoning. Our dataset and codes are available at https://github.com/whyNLP/Conic10K. © 2023 Association for Computational Linguistics. |
Votes: Kwesi, Xinchen · Final decision: not recorded
#1090
-
Wu 2024
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():15984-16007 Association for Computational Linguistics (ACL) 2024 Ref ID: 4364 Theory of mind (ToM) refers to humans' ability to understand and infer the desires, beliefs, and intentions of others. The acquisition of ToM plays a key role in humans' social cognition and interpersonal relations. Though indispensable for social intelligence, ToM is still lacking for modern AI and NLP systems since they cannot access the human mental state and cognitive process beneath the training corpus. To empower AI systems with the ToM ability and narrow the gap between them and humans, in this paper, we propose COKE: the first cognitive knowledge graph for machine theory of mind, formalizing cognitive processes as a chained structure. Specifically, COKE formalizes ToM as a collection of 45k+ manually verified cognitive chains that characterize human mental activities and subsequent behavioral/affective responses when facing specific social circumstances. In addition, we further generalize COKE using LLMs and build a powerful generation model COLM tailored for cognitive reasoning. Experimental results in both automatic and human evaluation demonstrate the high quality of COKE, the superior ToM ability of COLM, and its potential to significantly enhance social applications. We release our code and data at https://github.com/jincenziwu/COKE. © 2024 Association for Computational Linguistics. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#1818
-
Wu 2024
Research Progress on Digitalization and Intelligence in Food Domain Based on Knowledge Graphs
With the development of technologies such as big data and cloud computing, the scale of data in the food domain is growing at an astonishing rate. These data not only come from diverse sources and have complex structures, but also lack standardized terminology, which poses challenges to the effective integration and utilization of food-related data. Knowledge graphs, as a fundamental cornerstone of achieving general artificial intelligence, provide support for the organization and management of food data and its higher-level applications in terms of integration and semantic understanding. Summarizing recent research achievements of knowledge graphs in the food domain, this review covers their construction methods, including key steps such as ontology construction, knowledge extraction, knowledge fusion, and processing. The current applications of knowledge graphs in the food domain are then examined, particularly in three areas: food nutrition and health, food innovation and research, and food safety and traceability. Based on the current state of development of knowledge graphs in the food domain, and incorporating multimodal data fusion technology, large language model construction, and the intelligentization of industrial equipment in the food field, future development directions are anticipated. © 2024 Beijing Technology and Business University, Department of Science and Technology. All rights reserved. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#312
-
Wu 2024
Geospatial Big Data: Survey and Challenges
IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 2024;17():17007-17020 2024 DOI: 10.1109/jstars.2024.3438376 · Ref ID: 3401 In recent years, geospatial big data (GBD) has obtained attention across various disciplines, categorized into big Earth observation data and big human behavior data. Identifying geospatial patterns from GBD has been a vital research focus in the fields of urban management and environmental sustainability. This article reviews the evolution of GBD mining and its integration with advanced artificial intelligence techniques. GBD consists of data generated by satellites, sensors, mobile devices, and geographical information systems, and we categorize geospatial data based on different perspectives. We outline the process of GBD mining and demonstrate how it can be incorporated into a unified framework. In addition, we explore new technologies, such as large language models, the metaverse, and knowledge graphs, and how they could make GBD even more useful. We also share examples of GBD helping with city management and protecting the environment. Finally, we discuss the real challenges that come up when working with GBD, such as issues with data retrieval and security. Our goal is to give readers a clear view of where GBD mining stands today and where it might go next. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#2044
-
Wu 2023
Zero-Shot Construction of Chinese Medical Knowledge Graph with ChatGPT
Proceedings - 2023 1st IEEE International Conference on Medical Artificial Intelligence, MedAI 2023 2023;():278-283 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/MedAI59581.2023.00043 · Ref ID: 4959 Knowledge graphs have revolutionized the organization and retrieval of real-world knowledge, prompting interest in automatic NLP-based approaches for extracting medical knowledge from texts. However, the availability of high-quality Chinese medical knowledge remains limited, posing challenges for constructing Chinese medical knowledge graphs. As LLMs like ChatGPT show promise in zero-shot learning for many NLP downstream tasks, their potential on constructing Chinese medical knowledge graphs is still uncertain. In this study, we create a Chinese medical knowledge graph by manually annotating textual data and using ChatGPT to automatically generate the graph. We refine the results using filtering and mapping rules to align with our schema. The manually generated graph serves as the ground truth for evaluation, and we explore different methods to enhance its accuracy through knowledge graph completion techniques. As a result, we emphasize the potential of employing ChatGPT for automated knowledge graph construction within the Chinese medical domain. While ChatGPT successfully identifies a larger number of entities, further enhancements are required to improve its performance in extracting more qualified relations. © 2023 IEEE. |
Votes: Kwesi, brandon · Final decision: not recorded
#1807
-
Wu 2023
Research on Intelligent Question-Answering Systems Based on Large Language Models and Knowledge Graphs
Proceedings - 2023 16th International Symposium on Computational Intelligence and Design, ISCID 2023 2023;():161-164 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ISCID59865.2023.00045 · Ref ID: 4938 With the continuous development of artificial intelligence and cloud computing technologies, the emergence of large language models (LLMs) has created new opportunities for intelligent applications. However, large language models may lack authenticity and accuracy when providing answers in specific professional domains, and they even generate "illusory facts." In response to the limitations of current large language models in solving specific professional fields, this paper proposes to use large language models and knowledge graph technology to construct an intelligent question answering system for specific fields. Through systematic training and optimization, efficient domain specific knowledge Q&A has been achieved, improving the satisfaction rate of domain specific knowledge Q&A. The intelligent question answering system based on large models and knowledge graphs brings more convenience to people's lives and work, which is beneficial for users to obtain intelligent solutions in fields such as education, healthcare, and customer service. ©2023 IEEE. |
Votes: Xinchen, mohammed afaan · Final decision: not recorded
#1173
-
Wu 2023
DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():2875-2886 Association for Computational Linguistics (ACL) 2023 Ref ID: 5100 Large language models pretrained on a huge amount of data capture rich knowledge and information in the training data. The ability of data memorization and regurgitation in pretrained language models, revealed in previous studies, brings the risk of data leakage. In order to effectively reduce these risks, we propose a framework DEPN to Detect and Edit Privacy Neurons in pretrained language models, partially inspired by knowledge neurons and model editing. In DEPN, we introduce a novel method, termed the privacy neuron detector, to locate neurons associated with private information, and then edit these detected privacy neurons by setting their activations to zero. Furthermore, we propose a privacy neuron aggregator to dememorize private information in a batch processing manner. Experimental results show that our method can significantly and efficiently reduce the exposure of private data leakage without deteriorating the performance of the model. Additionally, we empirically demonstrate the relationship between model memorization and privacy neurons, from multiple perspectives, including model size, training time, prompts, and privacy neuron distribution, illustrating the robustness of our approach. ©2023 Association for Computational Linguistics. |
Votes: Davis, yuexi · Final decision: not recorded
#2076
-
Wu 2024
reguloGPT: Harnessing GPT for Knowledge Graph Construction of Molecular Regulatory Pathways
MOTIVATION: Molecular Regulatory Pathways (MRPs) are crucial for understanding biological functions. Knowledge Graphs (KGs) have become vital in organizing and analyzing MRPs, providing structured representations of complex biological interactions. Current tools for mining KGs from biomedical literature are inadequate in capturing complex, hierarchical relationships and contextual information about MRPs. Large Language Models (LLMs) like GPT-4 offer a promising solution, with advanced capabilities to decipher the intricate nuances of language. However, their potential for end-to-end KG construction, particularly for MRPs, remains largely unexplored. RESULTS: We present reguloGPT, a novel GPT-4 based in-context learning prompt, designed for end-to-end joint named entity recognition, N-ary relationship extraction, and context prediction from a sentence that describes regulatory interactions with MRPs. Our reguloGPT approach introduces a context-aware relational graph that effectively embodies the hierarchical structure of MRPs and resolves semantic inconsistencies by embedding context directly within relational edges. We created a benchmark dataset including 400 annotated PubMed titles on N6-methyladenosine (m(6)A) regulations. Rigorous evaluation of reguloGPT on the benchmark dataset demonstrated marked improvement over existing algorithms. We further developed a novel G-Eval scheme, leveraging GPT-4 for annotation-free performance evaluation, and demonstrated its agreement with traditional annotation-based evaluations. Utilizing reguloGPT predictions on m(6)A-related titles, we constructed the m(6)A-KG and demonstrated its utility in elucidating m(6)A's regulatory mechanisms in cancer phenotypes across various cancers. These results underscore reguloGPT's transformative potential for extracting biological knowledge from the literature.
AVAILABILITY AND IMPLEMENTATION: The source code of reguloGPT, the m(6)A title and benchmark datasets, and m(6)A-KG are available at: https://github.com/Huang-AI4Medicine-Lab/reguloGPT. |
Votes: Davis, Mike · Final decision: not recorded
#3842
-
Wu 2023
Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering
arXiv 2023;(): 2023 Ref ID: 7836 Despite their competitive performance on knowledge-intensive tasks, large language models (LLMs) still have limitations in memorizing all world knowledge especially long tail knowledge. In this paper, we study the KG-augmented language model approach for solving the knowledge graph question answering (KGQA) task that requires rich world knowledge. Existing work has shown that retrieving KG knowledge to enhance LLMs prompting can significantly improve LLMs performance in KGQA. However, their approaches lack a well-formed verbalization of KG knowledge, i.e., they ignore the gap between KG representations and textual representations. To this end, we propose an answer-sensitive KG-to-Text approach that can transform KG knowledge into well-textualized statements most informative for KGQA. Based on this approach, we propose a KG-to-Text enhanced LLMs framework for solving the KGQA task. Experiments on several KGQA benchmarks show that the proposed KG-to-Text augmented LLMs approach outperforms previous KG-augmented LLMs approaches regarding answer accuracy and usefulness of knowledge statements. |
Votes: Xinchen, mohammed afaan · Final decision: not recorded
#3299
-
Wu 2024
CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
arXiv 2024;(): 2024 Ref ID: 8638 Recent studies have explored the use of Large Language Models (LLMs) with Retrieval Augmented Generation (RAG) for Knowledge Graph Question Answering (KGQA). They typically require rewriting retrieved subgraphs into natural language formats comprehensible to LLMs. However, when tackling complex questions, the knowledge rewritten by existing methods may include irrelevant information, omit crucial details, or fail to align with the question's semantics. To address them, we propose a novel rewriting method CoTKR, Chain-of-Thought Enhanced Knowledge Rewriting, for generating reasoning traces and corresponding knowledge in an interleaved manner, thereby mitigating the limitations of single-step knowledge rewriting. Additionally, to bridge the preference gap between the knowledge rewriter and the question answering (QA) model, we propose a training strategy PAQAF, Preference Alignment from Question Answering Feedback, for leveraging feedback from the QA model to further optimize the knowledge rewriter. We conduct experiments using various LLMs across several KGQA benchmarks. Experimental results demonstrate that, compared with previous knowledge rewriting methods, CoTKR generates the most beneficial knowledge representation for QA models, which significantly improves the performance of LLMs in KGQA. |
Votes: Kwesi, yuexi · Final decision: not recorded
#3746
-
Wu 2023
Online Continual Knowledge Learning for Language Models
arXiv 2023;(): 2023 Ref ID: 7948 Large Language Models (LLMs) serve as repositories of extensive world knowledge, enabling them to perform tasks such as question-answering and fact-checking. However, this knowledge can become obsolete as global contexts change. In this paper, we introduce a novel problem in the realm of continual learning: Online Continual Knowledge Learning (OCKL). This problem formulation aims to manage the dynamic nature of world knowledge in LMs under real-time constraints. We propose a new benchmark and evaluation metric designed to measure both the rate of new knowledge acquisition and the retention of previously learned knowledge. Our empirical evaluation, conducted using a variety of state-of-the-art methods, establishes robust baselines for OCKL. Our results reveal that existing continual learning approaches are unfortunately insufficient for tackling the unique challenges posed by OCKL. We identify key factors that influence the trade-off between knowledge acquisition and retention, thereby advancing our understanding of how to train LMs in a continually evolving environment. |
Votes: Mike, Srividya · Final decision: not recorded
#3574
-
Wu 2024
KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment
arXiv 2024;(): 2024 Ref ID: 8536 Cyber threat intelligence is a critical tool that many organizations and individuals use to protect themselves from sophisticated, organized, persistent, and weaponized cyber attacks. However, few studies have focused on the quality assessment of threat intelligence provided by intelligence platforms, and this work still requires manual analysis by cybersecurity experts. In this paper, we propose a knowledge graph-based verifier, a novel Cyber Threat Intelligence (CTI) quality assessment framework that combines knowledge graphs and Large Language Models (LLMs). Our approach introduces LLMs to automatically extract OSCTI key claims to be verified and utilizes a knowledge graph consisting of paragraphs for fact-checking. This method differs from the traditional way of constructing complex knowledge graphs with entities as nodes. By constructing knowledge graphs with paragraphs as nodes and semantic similarity as edges, it effectively enhances the semantic understanding ability of the model and simplifies labeling requirements. Additionally, to fill the gap in the research field, we created and made public the first dataset for threat intelligence assessment from heterogeneous sources. To the best of our knowledge, this work is the first to create a dataset on threat intelligence reliability verification, providing a reference for future research. Experimental results show that KGV (Knowledge Graph Verifier) significantly improves the performance of LLMs in intelligence quality assessment. Compared with traditional methods, we reduce a large amount of data annotation while the model still exhibits strong reasoning capabilities. Finally, our method can achieve XXX accuracy in network threat assessment. |
Votes: Ishan, brandon · Final decision: not recorded
#3362
-
Xi 2024
Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models
arXiv 2024;(): 2024 Ref ID: 8547 Recommender systems (RSs) play a pervasive role in today's online services, yet their closed-loop nature constrains their access to open-world knowledge. Recently, large language models (LLMs) have shown promise in bridging this gap. However, previous attempts to directly implement LLMs as recommenders fall short in meeting the requirements of industrial RSs, particularly in terms of online inference latency and offline resource efficiency. Thus, we propose REKI to acquire two types of external knowledge about users and items from LLMs. Specifically, we introduce factorization prompting to elicit accurate knowledge reasoning on user preferences and items. We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption. Subsequently, generated knowledge undergoes efficient transformation and condensation into augmented vectors through a hybridized expert-integrated network, ensuring compatibility. The obtained vectors can then be used to enhance any conventional recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from LLMs. Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with lots of recommendation algorithms and tasks. Now, REKI has been deployed to Huawei's news and music recommendation platforms and gained a 7% and 1.99% improvement during the online A/B test. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1856
-
Xia 2023
Secure Co-Creation of Industrial Knowledge Graph: Graph Complement Method with Federated Learning and ChatGPT
IEEE International Conference on Automation Science and Engineering 2023;2023-August(): IEEE Computer Society 2023 DOI: 10.1109/CASE56687.2023.10260382 · Ref ID: 5241 Industrial areas have increasingly developed their own Knowledge Graphs (KGs) for organizing and leveraging vast amounts of data. One major challenge in constructing a KG is the heavy reliance on available resources, restricting the scalability and accuracy of the resulting graphs. To address this issue, an end-to-end method is proposed to create a multi-benefit ecosystem by integrating Federated Learning with ChatGPT (a popular language model). Different stakeholders may leverage ChatGPT to search for novel knowledge that complements their existing KGs; however, this approach could potentially introduce ambiguous and wrong triples into the KG. To overcome this, Federated Learning is applied to align and disambiguate the triples using other industrial KGs as supervision. The proposed method applies a multi-field hyperbolic embedding method to vectorize entities and edges, which are then associatively aggregated to achieve edge replenishment and entity fusion for each encrypted KG. Finally, an incentive win-win mechanism is proposed to motivate diverse stakeholders to contribute actively to this co-creation. A case study is conducted on different industrial KGs to evaluate the proposed method. Results demonstrate that this method provides a practical solution for KG co-creation without compromising data security. © 2023 IEEE. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3515
-
Xia 2024
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning
arXiv 2024;(): 2024 Ref ID: 8272 Answering complex queries over incomplete knowledge graphs (KGs) is a challenging job. Most previous works have focused on learning entity/relation embeddings and simulating first-order logic operators with various neural networks. However, they are bottlenecked by the inability to share world knowledge to improve logical reasoning, thus resulting in suboptimal performance. In this paper, we propose a complex reasoning schema over KG upon large language models (LLMs), containing a curriculum-based logical-aware instruction tuning framework, named LACT. Specifically, we augment the arbitrary first-order logical queries via binary tree decomposition, to stimulate the reasoning capability of LLMs. To address the difficulty gap among different types of complex queries, we design a simple and flexible logic-aware curriculum learning framework. Experiments across widely used datasets demonstrate that LACT has substantial improvements (brings an average +5.5% MRR score) over advanced methods, achieving the new state-of-the-art. Our code and model will be released at GitHub and huggingface soon. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1060
-
Xia 2024
Chain-of-History Reasoning for Temporal Knowledge Graph Forecasting
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():16144-16159 Association for Computational Linguistics (ACL) 2024 Ref ID: 4257 Temporal Knowledge Graph (TKG) forecasting aims to predict future facts based on given histories. Most recent graph-based models excel at capturing structural information within TKGs but lack semantic comprehension abilities. Nowadays, with the surge of LLMs, the LLM-based TKG prediction model has emerged. However, the existing LLM-based model exhibits three shortcomings: (1) It only focuses on the first-order history for prediction while ignoring high-order historical information, resulting in the provided information for LLMs being extremely limited. (2) LLMs struggle with optimal reasoning performance under heavy historical information loads. (3) For TKG prediction, the temporal reasoning capability of LLM alone is limited. To address the first two challenges, we propose Chain-of-History (CoH) reasoning which explores high-order histories step-by-step, achieving effective utilization of high-order historical information for LLMs on TKG prediction. To address the third issue, we design CoH as a plug-and-play module to enhance the performance of graph-based models for TKG prediction. Extensive experiments on three datasets and backbones demonstrate the effectiveness of CoH. © 2024 Association for Computational Linguistics. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1156
-
Xie 2020
Cyber security entity recognition method based on residual dilation convolution neural network
Ch. J. Netw. Inf. Secur. 2020;6(5):126-138 2020 DOI: 10.11959/j.issn.2096-109x.2020009 · Ref ID: 5756 In recent years, cybersecurity threats have increased, and data-driven security intelligence analysis has become a hot research topic in the field of cybersecurity. In particular, artificial intelligence technology, represented by the knowledge graph, can provide support for detecting complex and unknown cyberattacks in multi-source heterogeneous threat intelligence data. Cybersecurity entity recognition is the basis for the construction of threat intelligence knowledge graphs. The composition of security entities in open network text data is very complex, which makes accurate identification difficult for traditional deep learning methods. Based on the BERT pre-trained language model (pre-training of deep bidirectional transformers), a cybersecurity entity recognition model, BERT-RDCNN-CRF, based on a residual dilated convolutional neural network and conditional random field was proposed. The BERT model was used to train character-level feature vector representations. Residual convolution and the dilated neural network model are combined to effectively extract important features of security entities, and the BIO annotation of each character is finally obtained through CRF. Experiments on the constructed large-scale cybersecurity entity annotation dataset show that the proposed method achieves better results than the LSTM-CRF model, the BiLSTM-CRF model, and traditional entity recognition models. © 2020, Beijing Xintong Media Co., Ltd. All rights reserved. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#880
-
Xie 2021
WebKE: Knowledge Extraction from Semi-structured Web with Pre-trained Markup Language Model
30th ACM International Conference on Information and Knowledge Management (CIKM) 2021;():2211-2220 Univ Queensland, ELECTR NETWORK Assoc Computing Machinery 2021 DOI: 10.1145/3459637.3482491 · Ref ID: 3242 The World Wide Web contains rich, up-to-date information for knowledge graph construction. However, most current relation extraction techniques are designed for free text and thus do not handle semi-structured web content well. In this paper, we propose a novel multi-phase machine reading framework, called WebKE. It processes web content at different granularities by first detecting areas of interest at the DOM tree node level and then extracting relational triples for each area. We also propose HTMLBERT as an encoder for the web content. It is a pre-trained markup language model that fully leverages the visual layout information and DOM-tree structure, without the need for hand-engineered features. Experimental results show that the proposed approach outperforms state-of-the-art methods by a considerable margin. The source code is available at https://github.com/redreamality/webke. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3707
-
Xie 2024
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation
arXiv 2024;(): 2024 Ref ID: 8396 Knowledge editing aims to adjust the knowledge within large language models (LLMs) to prevent their responses from becoming obsolete or inaccurate. However, existing works on knowledge editing are primarily conducted in a single language, which is inadequate for multilingual language models. In this paper, we focus on multilingual knowledge editing (MKE), which requires propagating updates across multiple languages. This necessity poses a significant challenge for the task. Furthermore, the limited availability of a comprehensive dataset for MKE exacerbates this challenge, hindering progress in this area. Hence, we introduce the Multilingual Knowledge Editing Benchmark (MKEB), a novel dataset comprising 12 languages and providing a complete evaluation framework. Additionally, we propose a method that enhances Multilingual knowledge Editing with neuron-Masked Low-Rank Adaptation (MEMLA). Specifically, we identify two categories of knowledge neurons to improve editing precision. Moreover, we perform LoRA-based editing with neuron masks to efficiently modify parameters and facilitate the propagation of updates across multiple languages. Experiments demonstrate that our method outperforms existing baselines and significantly enhances the multi-hop reasoning capability of the edited model, with minimal impact on its downstream task performance. The dataset and code will be made publicly available. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#118
-
Xie 2024
Combining prompt learning with contextual semantics for inductive relation prediction
Inductive relation prediction for knowledge graphs aims to predict missing relations between two new entities. Most previous studies on relation prediction are limited to the transductive setting and cannot be applied to the inductive case. Recently, some inductive methods have been proposed to handle it by learning the topological semantics. However, they rely solely on structural information, disregarding the role of prior knowledge. In cases of sparse structures, this limitation is magnified, thereby hindering the inductive ability. Prior knowledge can not only filter out invalid topological structures but also complement the topological semantics. To this end, we propose a novel inductive model, PLCS, which incorporates prompt learning with contextual semantics to fully exploit prior knowledge. To filter out irrelevant topological structures, we innovatively employ hard prompts to mine prior knowledge in pre-trained language models (PLMs) as the basis for subgraph extraction. Additionally, we enhance semantic representation by integrating relation text descriptions into relation embeddings during initialization, supplementing the topological semantics. The experimental results on four benchmark datasets show the superiority of PLCS over existing state-of-the-art methods. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#1148
-
Xie 2024
Creation of a structured solar cell material dataset and performance prediction using large language models
Materials scientists usually collect experimental data to summarize experiences and predict improved materials. However, a crucial issue is how to proficiently utilize unstructured data to update existing structured data, particularly in applied disciplines. This study introduces a new natural language processing (NLP) task called structured information inference (SII) to address this problem. We propose an end-to-end approach to summarize and organize the multi-layered device-level information from the literature into structured data. After comparing different methods, we fine-tuned LLaMA with an F1 score of 87.14% to update an existing perovskite solar cell dataset with articles published since its release, allowing its direct use in subsequent data analysis. Using structured information, we developed regression tasks to predict the electrical performance of solar cells. Our results demonstrate comparable performance to traditional machine-learning methods without feature selection and highlight the potential of large language models for scientific knowledge acquisition and material development. © 2024 The Author(s) |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3315
-
Xie 2023
DARWIN Series: Domain Specific Large Language Models for Natural Science
arXiv 2023;(): 2023 Ref ID: 7816 Emerging tools bring forth fresh approaches to work, and the field of natural science is no different. In natural science, traditional manual, serial, and labour-intensive work is being augmented by automated, parallel, and iterative processes driven by artificial intelligence-based experimental automation and more. To add new capabilities in natural science, enabling the acceleration and enrichment of automation of the discovery process, we present DARWIN, a series of tailored LLMs for natural science, mainly in physics, chemistry, and material science. This series relies on open-source LLM, incorporating structured and unstructured scientific knowledge from public datasets and literature. We fine-tuned the models using over 60,000 instruction data points, emphasizing factual correctness. During the fine-tuning, we introduce the Scientific Instruction Generation (SIG) model, automating instruction generation from scientific texts. This eliminates the need for manual extraction or domain-specific knowledge graphs and efficiently injects scientific knowledge into the model. We also explore multi-task training strategies, revealing interconnections between scientific tasks. DARWIN series not only achieves state-of-the-art results on various scientific tasks but also diminishes reliance on closed-source AI models. Our research showcases the ability of LLM in the scientific domain, with the overarching goal of fostering prosperity within the broader AI for science community. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1703
-
Xin 2023
Online Knowledge Fusion Method for Fault Diagnosis of Power Plant Equipment
IEEE Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2023;():1236-1240 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ITAIC58329.2023.10408849 · Ref ID: 4994 There are many types of documents in fossil-fuel power stations that describe equipment failures, including maintenance records, treatment and diagnosis suggestions, historical cases, and equipment knowledge. The knowledge of equipment anomaly diagnosis and handling is scattered across these different documents. Extracting and fusing this scattered knowledge to generate a knowledge graph for equipment fault diagnosis provides necessary decision support for maintenance personnel to discover and handle equipment faults. This article proposes an implementation method for extracting and integrating equipment fault knowledge from diverse, multi-type text records to form a knowledge graph. A thermal power plant equipment fault Q&A system based on the fusion of open-source large language models and knowledge graphs has been developed. Main contributions: (1) A knowledge extraction algorithm integrating the BERT-WWM model and pointer annotation is proposed to jointly extract entity relations from fault text. Experiments show that the method performs well in extracting overlapped triples, and F1 is improved by 8.51% compared with existing algorithms. (2) A knowledge fusion model based on RoBERTa-BiLSTM is proposed, which fully utilizes the feature information of the entity text to be disambiguated and the entity mention text, and captures the interdependent features within the sentence through an attention mechanism. Experiments show that this method improves F1 by 9.56% compared to existing fusion algorithms.
(3) Based on the open-source large model ChatGLM, a fusion method of knowledge graph and the large model was explored, and a device fault question answering system for thermal power plants was implemented, achieving high accuracy in practical applications. © 2023 IEEE. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1798
-
Xing 2020
Relation extraction using language model based on knowledge graph
Journal of Physics: Conference Series 2020;1624(): IOP Publishing Ltd 2020 DOI: 10.1088/1742-6596/1624/2/022037 · Ref ID: 5698 Relation extraction is an important task in natural language processing (NLP). Existing methods generally focus on extracting textual semantic information from text but ignore the relational context available from existing relations in datasets, which is very important for the performance of the relation extraction task. In this paper, we represent each individual entity as an embedding based on an entity-relation knowledge graph, which encodes the relational context between the given entity pairs and relations. Besides, inspired by the impressive recent performance of language models, we use a language model to leverage word semantic information, which it captures better than static word embeddings. The experimental results on the SemEval-2010 Task 8 dataset show that the F1-score of our proposed method improved by nearly 3% compared with previous methods. © 2020 Institute of Physics Publishing. All rights reserved. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#54
-
Xiong 2022
AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL
31st ACM International Conference on Information and Knowledge Management (CIKM) 2022;():2250-2259 Atlanta, GA Assoc Computing Machinery 2022 DOI: 10.1145/3511808.3557246 · Ref ID: 3435 This study investigates the task of knowledge-based question generation (KBQG). Conventional KBQG works generate questions from fact triples in the knowledge graph, which cannot express complex operations like aggregation and comparison in SPARQL. Moreover, due to the costly annotation of large-scale SPARQL-question pairs, KBQG from SPARQL under low-resource scenarios urgently needs to be explored. Recently, since generative pre-trained language models (PLMs) typically trained in a natural language (NL)-to-NL paradigm have been proven effective for low-resource generation (e.g., T5 and BART), how to effectively utilize them to generate NL questions from non-NL SPARQL is challenging. To address these challenges, AutoQGS, an auto-prompt approach for low-resource KBQG from SPARQL, is proposed. Firstly, we propose generating questions directly from SPARQL for the KBQG task to handle complex operations. Secondly, we propose an auto-prompter trained on large-scale unsupervised data to rephrase SPARQL into an NL description, smoothing the low-resource transformation from non-NL SPARQL to NL questions with PLMs. Experimental results on WebQuestionsSP, ComplexWebQuestions 1.1, and PathQuestions show that our model achieves state-of-the-art performance, especially in low-resource settings. Furthermore, a corpus of 330k factoid complex question-SPARQL pairs is generated for further KBQG research. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3292
-
Xiong 2024
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents
arXiv 2024;(): 2024 Ref ID: 8453 This article explores the convergence of connectionist and symbolic artificial intelligence (AI), from historical debates to contemporary advancements. Traditionally considered distinct paradigms, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs), exemplified by ChatGPT and GPT-4, highlight the potential of connectionist architectures in handling human language as a form of symbols. The study argues that LLM-empowered Autonomous Agents (LAAs) embody this paradigm convergence. By utilizing LLMs for text-based knowledge modeling and representation, LAAs integrate neuro-symbolic AI principles, showcasing enhanced reasoning and decision-making capabilities. Comparing LAAs with Knowledge Graphs within the neuro-symbolic AI theme highlights the unique strengths of LAAs in mimicking human-like reasoning processes, scaling effectively with large datasets, and leveraging in-context samples without explicit re-training. The research underscores promising avenues in neuro-vector-symbolic integration, instructional encoding, and implicit reasoning, aimed at further enhancing LAA capabilities. By exploring the progression of neuro-symbolic AI and proposing future research trajectories, this work advances the understanding and development of AI technologies. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3912
-
Xu 2024
Text-Driven Neural Collaborative Filtering Model for Paper Source Tracing
arXiv 2024;(): 2024 Ref ID: 8483 Identifying significant references within the complex interrelations of a citation knowledge graph is challenging, which encompasses connections through citations, authorship, keywords, and other relational attributes. The Paper Source Tracing (PST) task seeks to automate the identification of pivotal references for given scholarly articles utilizing advanced data mining techniques. In the KDD CUP OAG-Challenge PST track, we design a recommendation-based framework tailored for the PST task. This framework employs the Neural Collaborative Filtering (NCF) model to generate final predictions. To process the textual attributes of the papers and extract input features for the model, we utilize SciBERT, a pre-trained language model. According to the experimental results, our method achieved a score of 0.37814 on the Mean Average Precision (MAP) metric, outperforming baseline models and ranking 11th among all participating teams. The source code is publicly available at https://github.com/MyLove-XAB/KDDCupFinal. |
Davis
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3285
-
Xu 2024
Context Graph
arXiv 2024;(): 2024 Ref ID: 8392 Knowledge Graphs (KGs) are foundational structures in many AI applications, representing entities and their interrelations through triples. However, triple-based KGs lack the contextual information of relational knowledge, like temporal dynamics and provenance details, which are crucial for comprehensive knowledge representation and effective reasoning. Instead, Context Graphs (CGs) expand upon the conventional structure by incorporating additional information such as time validity, geographic location, and source provenance. This integration provides a more nuanced and accurate understanding of knowledge, enabling KGs to offer richer insights and support more sophisticated reasoning processes. In this work, we first discuss the inherent limitations of triple-based KGs and introduce the concept of CGs, highlighting their advantages in knowledge representation and reasoning. We then present a context graph reasoning (CGR³) paradigm that leverages large language models (LLMs) to retrieve candidate entities and related contexts, rank them based on the retrieved information, and reason whether sufficient information has been obtained to answer a query. Our experimental results demonstrate that CGR³ significantly improves performance on KG completion (KGC) and KG question answering (KGQA) tasks, validating the effectiveness of incorporating contextual information in KG representation and reasoning. |
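The contrast the abstract draws between bare triples and context-enriched facts can be sketched as follows; the fact schema, field names, and example data are hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContextFact:
    """A triple extended with time validity and provenance, per the CG idea."""
    subj: str
    rel: str
    obj: str
    valid_from: int  # year, illustrative granularity
    valid_to: int
    source: str

FACTS = [
    ContextFact("Alice", "ceo_of", "Acme", 2015, 2019, "report-a"),
    ContextFact("Bob", "ceo_of", "Acme", 2020, 2024, "report-b"),
]

def query_at(rel: str, obj: str, year: int):
    """Unlike a bare triple lookup, a context graph can answer a time-scoped
    query and cite provenance for each returned fact."""
    return [f for f in FACTS
            if f.rel == rel and f.obj == obj
            and f.valid_from <= year <= f.valid_to]

hits = query_at("ceo_of", "Acme", 2021)  # only the temporally valid fact
```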
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3109
-
Xu 2024
Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research
Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances in Urban-AI 2024;():43–49 Atlanta, GA, USA Association for Computing Machinery 2024 DOI: 10.1145/3681780.3697252 · Ref ID: 7292 |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3862
-
Xu 2023
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
arXiv 2023;(): 2023 Ref ID: 7684 Making the content generated by Large Language Models (LLMs) accurate, credible, and traceable is crucial, especially in complex knowledge-intensive tasks that require multi-step reasoning, where each step needs knowledge to solve. Retrieval-augmented generation has good potential to solve this problem. However, where and how to introduce Information Retrieval (IR) into the LLM is a big challenge. Previous work suffers from two problems: wrong knowledge retrieved by IR misleads the LLM, and interaction between IR and the LLM breaks the LLM's reasoning chain. This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between LLM and IR to solve these challenges. First, the LLM generates a reasoning chain named Chain-of-Query (CoQ), where each node consists of an IR-oriented query-answer pair. Second, IR verifies the answer of each node of the CoQ. It corrects any answer that is not consistent with the retrieved information when IR gives high confidence, which improves credibility. Third, the LLM can indicate its missing knowledge in the CoQ and rely on IR to provide this knowledge. These operations improve accuracy in terms of reasoning and knowledge. Finally, SearChain generates the reasoning process and marks references to supporting documents for each reasoning step, which improves traceability. Interaction with IR in SearChain forms a novel tree-based reasoning path, which enables the LLM to dynamically modify the direction of reasoning. Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks including multi-hop Q&A, slot filling, fact checking, and long-form Q&A. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#433
-
Xu 2024
Knowledge graph construction for heart failure using large language models with prompt engineering
Introduction: Constructing an accurate and comprehensive knowledge graph of specific diseases is critical for practical clinical disease diagnosis and treatment, reasoning and decision support, rehabilitation, and health management. For knowledge graph construction tasks (such as named entity recognition and relation extraction), classical BERT-based methods require a large amount of training data to ensure model performance. However, real-world medical annotation data, especially disease-specific annotated samples, are very limited. In addition, existing models do not perform well in recognizing out-of-distribution entities and relations not seen in the training phase. Method: In this study, we present a novel and practical pipeline for constructing a heart failure knowledge graph using large language models and medical expert refinement. We apply prompt engineering to the three phases of the pipeline: schema design, information extraction, and knowledge completion. The best performance is achieved by designing task-specific prompt templates combined with the TwoStepChat approach. Results: Experiments on two datasets show that the TwoStepChat method outperforms the vanilla prompt and the fine-tuned BERT-based baselines. Moreover, our method saves 65% of the time compared to manual annotation and is better suited to extracting out-of-distribution information in the real world. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3026
-
Xu 2010
Towards intelligent query processing based on Attribute-Oriented Generalization
2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery 2010;5():2026-2030 2010 DOI: 10.1109/FSKD.2010.5569667 · Ref ID: 6644 Due to the intrinsic characteristics of the relational model, the standard database user interface forces users to be familiar with the database schema and underlying data to improve the efficiency of information retrieval. This paper proposes intelligent query processing based on Attribute-Oriented Generalization (AOG) to improve the user experience during information retrieval. Firstly, it adds semantics discarded by the relational model to raw data through attribute-oriented generalization and builds instances of a Type Abstraction Hierarchy (TAH). Secondly, it constructs the knowledge base, which is designed on a relational model, so that both the knowledge base and the underlying relational database can be handled in a single formalism by a relational query language. Thirdly, with the derived specific knowledge base incorporated into the underlying database, a prototype intelligent, interactive, and straightforward query processing system with B/S architecture has been built on top of SQL, which returns semantically neighboring values and higher-level, more abstract values through the TAH instance according to the results of the interaction with the user. Finally, three practical query examples are presented to further exemplify the main ideas and demonstrate the usefulness of the proposed query answering process. |
Davis
voted
Mike
voted
Final decision
What was the agreed final decision?
#715
-
Xu 2024
A representation learning-based approach to enhancing manufacturing quality for low-voltage electrical products
In low -voltage electrical product manufacturing, resolving quality issues is heavily reliant on engineering experience, and can be time-consuming and error -prone. Through quality management systems, a large number of historical defect cases can be consolidated for analysis along with relevant causes. However, these defect descriptions are often casually described with a mix of Chinese and English language, containing domain -specific terms. Additionally, defect product features have varieties and complex relationships. Therefore, historical defect cases have not been effectively utilized to support manufacturing quality issues. To address this challenge, this study proposes a representation learning -based approach to enhance manufacturing quality. Key research contributions include: (1) A two -stage word embedding technique based on the pre -trained language model. First, TSDAE is utilized for unsupervised pre -training on a large amount of unlabeled data. Then, Sentence -BERT is utilized for fine-tuning on a small set of labeled similar sentence pairs. This process yields a pre -trained language model specific to low -voltage electrical product defects. (2) NSHPSAGE graph embedding model based on the constructed product feature knowledge graph. We select more valuable neighboring nodes during sampling and explore different aggregation functions to enhance graph embedding performance. This model effectively aggregates product feature information into "Defect_Case" nodes, yielding graph embedding vectors. The model exhibits good Weighted -Precision and Weighted -Recall with a short training duration, and it can handle new nodes, addressing the issue of heterogeneous graph embedding. (3) A defect case recommendation technique that fuses word embedding and graph embedding. We use Multi -Head Attention Fusion in the late -fusion to obtain defect case vectors. 
This approach comprehensively considers defect description semantic knowledge and complex product feature relationships, enabling accurate defect case recommendation with the prototype system. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#729
-
Xu 2024
Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering
47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():2905-2909 Washington, DC Assoc Computing Machinery 2024 DOI: 10.1145/3626772.3661370 · Ref ID: 3251 In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval-augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations, which limits performance. We introduce a novel customer service question-answering method that amalgamates RAG with a knowledge graph (KG). Our method constructs a KG from historical issues for use in retrieval, retaining the intra-issue structure and inter-issue relations. During the question-answering phase, our method parses consumer queries and retrieves related sub-graphs from the KG to generate answers. This integration of a KG not only improves retrieval accuracy by preserving customer service structure information but also enhances answering quality by mitigating the effects of text segmentation. Empirical assessments on our benchmark datasets, utilizing key retrieval (MRR, Recall@K, NDCG@K) and text generation (BLEU, ROUGE, METEOR) metrics, reveal that our method outperforms the baseline by 77.6% in MRR and by 0.32 in BLEU. Our method has been deployed within LinkedIn's customer service team for approximately six months and has reduced the median per-issue resolution time by 28.6%. |
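The MRR figure reported above can be made concrete with a small sketch. This is an illustrative Mean Reciprocal Rank implementation, not the paper's evaluation code; the ticket IDs and relevance judgments below are hypothetical.

```python
def mrr(ranked_lists, relevant_sets):
    """Mean Reciprocal Rank: average over queries of 1/rank of the
    first relevant item in each ranked result list (0 if none found)."""
    total = 0.0
    for results, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(results, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

# Two hypothetical queries: first relevant ticket at rank 1 and rank 2.
retrieved = [["t12", "t7", "t3"], ["t9", "t4", "t1"]]
gold = [{"t12"}, {"t4"}]
print(mrr(retrieved, gold))  # (1 + 1/2) / 2 = 0.75
```

Recall@K and NDCG@K follow the same per-query averaging pattern, differing only in the per-query score.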
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#257
-
Xu 2024
Exploring Causal Chain Identification: Comprehensive Insights from Text and Knowledge Graphs
26th International Conference on Data Warehousing and Knowledge Discovery (DaWaK) 2024;14912():129-146 Naples, ITALY Springer International Publishing Ag 2024 DOI: 10.1007/978-3-031-68323-7_11 · Ref ID: 3536 During real-world reasoning, the logic path is generally not explicitly articulated. An appropriate causal chain can offer abundant informative details to depict a logical pathway, which is also beneficial in preventing ambiguity problems during text generation. However, most causal chains tend to lose their causal meaning after multiple hops, a phenomenon that also occurs in chains of other relations. To discriminate broken linkage in the chain detection task, we introduce the CK-CEVAE model, Chained domain Knowledge in Cause Effect Variational AutoEncoder, which integrates knowledge into the representation of causal assumptions within chains, employing sequential probabilistic distributions for cause-effect estimation. Our model demonstrates an improvement of around 4% in F1-score over LLM-based and neural-based models in identifying causal chains originating from text. Furthermore, to investigate the semantic continuity of chains within established knowledge graphs, we curate a chain-structured dataset, highlighting both causal relations and multiple non-causal relations, i.e. used for, synonym and similar to, termed the ConceptNet-CC dataset. We noticed that the longer the chains, the fewer instances exist. However, contrary to our intuitions, models perform better at identifying longer chains than shorter ones in uni-directional relations like causes and used for. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3956
-
Xue 2024
Unlock the Power of Frozen LLMs in Knowledge Graph Completion
arXiv 2024;(): 2024 Ref ID: 8527 Traditional knowledge graph completion (KGC) methods rely solely on structural information, struggling with the inherent sparsity of knowledge graphs (KGs). Large Language Models (LLMs) learn extensive knowledge from large corpora with powerful context modeling, making them promising for mitigating the limitations of previous methods. Directly fine-tuning LLMs offers great capability but comes at the cost of huge time and memory consumption, while utilizing frozen LLMs yields suboptimal results. In this work, we aim to leverage LLMs for KGC effectively and efficiently. We capture the context-aware hidden states of knowledge triples by employing prompts to stimulate the intermediate layers of LLMs. We then train a data-efficient classifier on these hidden states to harness the inherent capabilities of frozen LLMs in KGC. Additionally, to reduce ambiguity and enrich knowledge representation, we generate detailed entity descriptions through subgraph sampling on KGs. Extensive experiments on standard benchmarks demonstrate the efficiency and effectiveness of our approach. We outperform traditional KGC methods across most datasets and, notably, achieve classification performance comparable to fine-tuned LLMs while enhancing GPU memory efficiency by 188× and accelerating training and inference by 13.48×. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2254
-
Xueqin 2013
Complex event recognition with uncertainty reasoning
2013 International Conference on Machine Learning and Cybernetics 2013;04():1823-1828 2013 DOI: 10.1109/ICMLC.2013.6890893 · Ref ID: 6271 The goal of complex event recognition considered in this paper is the automatic detection of complex high-level events in videos. This is a difficult task, especially when videos are captured under unconstrained conditions, with poor lighting, heavy background clutter and occlusion. In this paper, we propose a hierarchical knowledge-based framework for complex event recognition. The video event knowledge represents an arbitrary complex spatio-temporal event as a hierarchical composition of simpler events in a natural way. Uncertainty reasoning procedures are applied to interpret low level event descriptions according to the video knowledge base in order to recognize high level scenarios. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2014
-
Yadav 2023
Unleashing the Power of Large Language Model, Textual Embeddings, and Knowledge Graphs for Advanced Information Retrieval
International Conference on Electrical, Computer and Energy Technologies, ICECET 2023 2023;(): Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICECET58911.2023.10389253 · Ref ID: 4984 Acquiring knowledge beyond the usual expertise is a critical challenge when implementing semantic information solutions for querying a knowledge base. To address this difficulty, one proposed solution was to use knowledge graphs in conjunction with traditional Question & Answering (Q&A) systems. However, this approach struggles with limited facts, difficulty in obtaining further insights into the context, and limited ability to handle complex questions, leading to inaccurate or irrelevant answers. To overcome these limitations, we present an approach for answering inference-based questions that integrates knowledge graphs, a large language model, and relevant embeddings from a vector database. Combining knowledge graphs and word embeddings significantly enhances the strength of both techniques, leading to improved performance of Question and Answering systems. We begin with generating representations of the relevant nodes in the knowledge graph and retrieve the most appropriate information from a collection of stored textual data using word embeddings. This approach tackles the shortcomings of conventional approaches that rely solely on knowledge graphs and are too rigid to handle the nuances of the context. This method provides a sophisticated understanding of language and context, enabling it to handle complex questions that may involve multiple entities and relationships with a better understanding of the facts and context in which the question is being asked. The system's ability to handle complex queries is evidenced through a combination of theoretical analysis and empirical data. Our approach has demonstrated exceptional efficiency on a benchmark dataset, as evidenced by evaluating the F1 score. © 2023 IEEE. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2718
-
Yan 2024
Multi-view Few-shot Reasoning for Emerging Entities in Knowledge Graphs
A knowledge graph (KG) is a form of representing knowledge of the objective world. With the expansion of knowledge, KGs frequently incorporate new entities, which often possess limited associated data, known as few-shot features. Addressing the missing knowledge for these emerging entities is crucial practically, but there are significant challenges due to data scarcity. Previously developed methods based on knowledge graph embedding (KGE) and graph neural networks (GNNs) focusing on instance-level KGs are confronted with challenges of data scarcity and model simplicity, rendering them inapplicable to reasoning tasks in few-shot scenarios. To tackle these issues, we propose a multi-view few-shot KG reasoning method for emerging entities. The primary focus of our method lies in resolving the problem of link prediction for emerging entities with limited associated triples from multiple perspectives. Distinct from previous methods, our approach initially abstracts a concept-view KG from the conventional instance-view KG, enabling the formulation of commonsense rules. Additionally, we employ the aggregation of multi-hop subgraph features to enhance the representation of emerging entities. Furthermore, we introduce a more efficient cross-domain negative sampling strategy and a multi-view triple scoring function based on commonsense rules. Our experimental evaluations highlight the effectiveness of our method in few-shot contexts, demonstrating its robustness and adaptability in both cross-shot and zero-shot scenarios, significantly outperforming existing models in these challenging settings. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3561
-
Yan 2021
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
arXiv 2021;(): 2021 Ref ID: 7458 Though pre-trained language models such as BERT and XLNet have rapidly advanced the state of the art on many NLP tasks, they capture only implicit semantics, relying on surface information between words in the corpus. Intuitively, background knowledge influences the efficacy of understanding. Inspired by this common sense, we focus on improving model pretraining by leveraging explicit knowledge. Different from recent research that optimizes pretraining models by knowledge masking strategies, we propose a simple but general method to combine explicit knowledge with pretraining. To be specific, we first match knowledge facts from a knowledge graph (KG) and then add a knowledge injection layer to the transformer directly, without changing its architecture. The present study seeks to find the direct impact of explicit knowledge on transformer pretraining. We conduct experiments on various datasets for different downstream tasks. The experimental results show that adding external knowledge to the transformer alone can improve learning performance on many NLP tasks. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1542
-
Yan 2024
KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration
The increasing reliance on Large Language Models (LLMs) for health information seeking can pose severe risks due to the potential for misinformation and the complexity of these topics. This paper introduces KNOWNET, a visualization system that integrates LLMs with Knowledge Graphs (KG) to provide enhanced accuracy and structured exploration. Specifically, for enhanced accuracy, KNOWNET extracts triples (e.g., entities and their relations) from LLM outputs and maps them into the validated information and supported evidence in external KGs. For structured exploration, KNOWNET provides next-step recommendations based on the neighborhood of the currently explored entities in KGs, aiming to guide a comprehensive understanding without overlooking critical aspects. To enable reasoning with both the structured data in KGs and the unstructured outputs from LLMs, KNOWNET conceptualizes the understanding of a subject as the gradual construction of a graph visualization. A progressive graph visualization is introduced to monitor past inquiries, and bridge the current query with the exploration history and next-step recommendations. We demonstrate the effectiveness of our system via use cases and expert interviews. © 2024 IEEE. |
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3196
-
Yan 2024
Atomic Fact Decomposition Helps Attributed Question Answering
arXiv 2024;(): 2024 Ref ID: 8741 Attributed Question Answering (AQA) aims to provide both a trustworthy answer and a reliable attribution report for a given question. Retrieval is a widely adopted approach, including two general paradigms: Retrieval-Then-Read (RTR) and post-hoc retrieval. Recently, Large Language Models (LLMs) have shown remarkable proficiency, prompting growing interest in AQA among researchers. However, RTR-based AQA often suffers from irrelevant knowledge and rapidly changing information, even when LLMs are adopted, while post-hoc retrieval-based AQA struggles with comprehending long-form answers with complex logic, and precisely identifying the content needing revision and preserving the original intent. To tackle these problems, this paper proposes an Atomic fact decomposition-based Retrieval and Editing (ARE) framework, which decomposes the generated long-form answers into molecular clauses and atomic facts by the instruction-tuned LLMs. Notably, the instruction-tuned LLMs are fine-tuned using a well-constructed dataset, generated from large scale Knowledge Graphs (KGs). This process involves extracting one-hop neighbors from a given set of entities and transforming the result into coherent long-form text. Subsequently, ARE leverages a search engine to retrieve evidences related to atomic facts, inputting these evidences into an LLM-based verifier to determine whether the facts require expansion for re-retrieval or editing. Furthermore, the edited facts are backtracked into the original answer, with evidence aggregated based on the relationship between molecular clauses and atomic facts. Extensive evaluations demonstrate the superior performance of our proposed method over the state-of-the-arts on several datasets, with an additionally proposed new metric Attr_(p) for evaluating the precision of evidence attribution. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3517
-
Yang 2021
Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta Information
arXiv 2021;(): 2021 Ref ID: 7507 Conversational recommendation systems (CRS) engage with users by inferring user preferences from dialog history, providing accurate recommendations, and generating appropriate responses. Previous CRSs use knowledge graph (KG) based recommendation modules and integrate KG with language models for response generation. Although KG-based approaches prove effective, two issues remain to be solved. First, KG-based approaches ignore the information in the conversational context but only rely on entity relations and bag of words to recommend items. Second, it requires substantial engineering efforts to maintain KGs that model domain-specific relations, thus leading to less flexibility. In this paper, we propose a simple yet effective architecture comprising a pre-trained language model (PLM) and an item metadata encoder. The encoder learns to map item metadata to embeddings that can reflect the semantic information in the dialog context. The PLM then consumes the semantic-aligned item embeddings together with dialog context to generate high-quality recommendations and responses. Instead of modeling entity relations with KGs, our model reduces engineering complexity by directly converting each item to an embedding. Experimental results on the benchmark dataset ReDial show that our model obtains state-of-the-art results on both recommendation and response generation tasks. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1231
-
Yang 2024
EHAPZero: Ensemble Hierarchical Attribute Prompting Based Zero-Shot Learning for Pest Recognition
Pest recognition is of great significance for achieving sustainable development in agriculture. Nevertheless, due to the wide variety of pest species, subtle inter-species differences, and significant intra-species variations, existing artificial intelligence and Internet of Things (IoT) technologies can only recognize a small number of known pests effectively. In this paper, we propose a zero-shot learning pest recognition framework based on ensemble hierarchical attribute prompting, termed EHAPZero. EHAPZero can identify pest images collected by IoT devices, and then transmit the recognition results to the IoT platform for terminal display. Specifically, the image recognition function is implemented by an attribute generation module (AGM), a hierarchical prompting module (HPM), and a semantic-visual interaction module (SVIM). AGM utilizes large language models to construct a knowledge graph of pests. It employs both node importance evaluation algorithms and manual methods to perform dual filtering on attribute nodes within the graph. Inspired by human knowledge reasoning, HPM dynamically predicts different hierarchical attributes of input images within the Transformer intermediate blocks. These predicted attributes are subsequently injected into the intermediate layer features of the Transformer as prompts. To achieve semantic disambiguation and knowledge transfer, SVIM employs a visual-guided semantic representation method and a semantic-guided visual representation method to strengthen cross-domain interaction between semantics and vision. Finally, the prediction score is derived through an ensemble of prediction results across different levels. Extensive experiments show that EHAPZero achieves new state-of-the-art results on the real-world pest recognition benchmark. The codes are available at: https://github.com/jinqiwen/EHAPZero. © 2014 IEEE. |
Ishan
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3845
-
Yang 2023
A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises
arXiv 2023;(): 2023 Ref ID: 7753 Healthcare knowledge graphs (HKGs) are valuable tools for organizing biomedical concepts and their relationships with interpretable structures. The recent advent of large language models (LLMs) has paved the way for building more comprehensive and accurate HKGs. This, in turn, can improve the reliability of generated content and enable better evaluation of LLMs. However, the challenges of HKGs such as regarding data heterogeneity and limited coverage are not fully understood, highlighting the need for detailed reviews. This work provides the first comprehensive review of HKGs. It summarizes the pipeline and key techniques for HKG construction, as well as the common utilization approaches, i.e., model-free and model-based. The existing HKG resources are also organized based on the data types they capture and application domains they cover, along with relevant statistical information (Resource available at https://github.com/lujiaying/Awesome-HealthCare-KnowledgeBase). At the application level, we delve into the successful integration of HKGs across various health domains, ranging from fine-grained basic science research to high-level clinical decision support and public health. Lastly, the paper highlights the opportunities for HKGs in the era of LLMs. This work aims to serve as a valuable resource for understanding the potential and opportunities of HKG in health research. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1572
-
Yang 2024
Learning Choice Nuance for Multiple-Choice Commonsense Question Answering
Proceedings of the International Joint Conference on Neural Networks 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/IJCNN60899.2024.10651121 · Ref ID: 4286 Existing models for commonsense question answering (CQA) usually focus on combining pre-trained language models (PLMs) and structured knowledge graphs (KGs) for joint reasoning. However, such approaches encode a QA context (i.e., a pair of the question and a choice) separately from the other choices, which is ineffective for explicitly capturing useful subtle differences among the choices, and results in incorrect answers in some cases. This paper proposes a novel model LNC (Learning Nuance among Choices) for addressing this problem and thus provides an improved approach to multiple-choice question answering. Specifically, LNC explicitly interacts between the text knowledge corresponding to each choice and the external KG knowledge corresponding to each choice, and removes the commonalities among similar choices, allowing the model to focus on different relevant knowledge based on the choices, thereby distinguishing semantically similar choices. Experimental results on major benchmark datasets show that LNC is competitive compared to the baseline models. © 2024 IEEE. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1292
-
Yang 2023
Expanding the Vocabulary of BERT for Knowledge Base Construction
CEUR Workshop Proceedings 2023;3577(): CEUR-WS 2023 Ref ID: 5066 Knowledge base construction entails acquiring structured information to create a knowledge base of factual and relational data, facilitating question answering, information retrieval, and semantic understanding. The challenge called "Knowledge Base Construction from Pretrained Language Models" at International Semantic Web Conference 2023 defines tasks focused on constructing a knowledge base using a language model. Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion, and the inclusion of entity descriptions within the prompt is prohibited. Although the masked language model offers sufficient flexibility to extend its vocabulary, it is not inherently designed for multi-token prediction. To address this, we present Vocabulary Expandable BERT for knowledge base construction, which expands the language model's vocabulary while preserving semantic embeddings for newly added words. We adopt task-specific re-pre-training on the masked language model to further enhance the language model. Through experimentation, the results show the effectiveness of our approaches. Our framework achieves an F1 score of 0.323 on the hidden test set and 0.362 on the validation set, both datasets provided by the challenge. Notably, our framework adopts a lightweight language model (BERT-base, 0.13 billion parameters) and surpasses the model using prompts directly on a large language model (Chatgpt-3, 175 billion parameters). Besides, Token-Recode achieves comparable performance to Re-pretrain. This research advances language understanding models by enabling the direct embedding of multi-token entities, signifying a substantial step forward in the link prediction task in knowledge graphs and metadata completion in data management. © 2023 CEUR-WS. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1907
-
Yang 2024
Structure Prompt Augmented Language Model Embedding on Electrical Equipment Defect Knowledge Graph
Knowledge graphs have demonstrated significant impact in the power grid domain, facilitating various applications such as defect diagnosis and grid management. However, their reasoning capabilities have not been fully exploited. In this paper, we explore the utilization of knowledge graphs for power grid defect diagnosis. We construct an electrical equipment defect knowledge graph and predict missing links, which is also known as Knowledge Graph Completion (KGC). However, we notice the long-tail problem in the electrical equipment knowledge graph. To tackle this challenge, we propose a novel text-based model named SPALME (Structure Prompt Augmented Language Model Embedding) that incorporates structural information as prompts. Our model leverages the power of pre-trained language models, allowing it to comprehend the semantic information of entities and relationships in the knowledge graph. Additionally, by integrating structural information as prompts during the learning process, our model gains a deeper understanding of the graph's topological structure efficiently, effectively capturing intricate dependencies between grid equipment. We evaluate our approach on various datasets. The results demonstrate that our model consistently outperforms baseline methods on the majority of the datasets. © 2024 World Scientific Publishing Company. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#611
-
Yang 2020
NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model
22nd IEEE International Conference on Advanced Communication Technology (ICACT) 2020;():185-189 Pyeongchang, SOUTH KOREA Ieee 2020 DOI: 10.23919/icact48636.2020.9061292 · Ref ID: 3065 Pre-trained language models like BERT, RoBERTa, GPT, etc. have achieved SOTA results on multiple NLP tasks (e.g. sentiment classification, information extraction, event extraction, etc.). We propose a simple method based on a knowledge graph to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates at the same time. Second, we treat different predicates as different fields, and improve the recognition ability of NMT models in different fields through classification labels. Finally, beam search combined with L2R, R2L rearranges results through entities. Based on the CWMT2018 experimental data, using the predicate's domain classification identifier, the BLEU score increased from 33.58% to 37.63%, and through L2R, R2L rearrangement, the BLEU score increased to 39.25%, an overall improvement of more than 5 percentage points. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#314
-
Yang 2024
Give us the Facts: Enhancing Large Language Models With Knowledge Graphs for Fact-Aware Language Modeling
IEEE Trans. Knowl. Data Eng. 2024;36(7):3091-3110 2024 DOI: 10.1109/tkde.2024.3360454 · Ref ID: 2944 Recently, ChatGPT, a representative large language model (LLM), has gained considerable attention. Due to their powerful emergent abilities, recent LLMs are considered as a possible alternative to structured knowledge bases like knowledge graphs (KGs). However, while LLMs are proficient at learning probabilistic language patterns and engaging in conversations with humans, they, like previous smaller pre-trained language models (PLMs), still have difficulty in recalling facts while generating knowledge-grounded contents. To overcome these limitations, researchers have proposed enhancing data-driven PLMs with knowledge-based KGs to incorporate explicit factual knowledge into PLMs, thus improving their performance in generating texts requiring factual knowledge and providing more informed responses to user queries. This paper reviews the studies on enhancing PLMs with KGs, detailing existing knowledge graph enhanced pre-trained language models (KGPLMs) as well as their applications. Inspired by existing studies on KGPLM, this paper proposes enhancing LLMs with KGs by developing knowledge graph-enhanced large language models (KGLLMs). KGLLM provides a solution to enhance LLMs' factual reasoning ability, opening up new avenues for LLM research. |
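As a generic illustration of the KG-enhancement pattern this survey reviews (a sketch of the idea only, not code from any KGPLM or KGLLM system), retrieved triples can be serialized into plain-text facts and prepended to the prompt; the triples, relation names, and prompt wording below are hypothetical.

```python
def triples_to_context(triples):
    """Serialize (head, relation, tail) triples into plain-text facts
    suitable for grounding an LLM prompt."""
    return "\n".join(f"{h} {r.replace('_', ' ')} {t}." for h, r, t in triples)

# Hypothetical facts retrieved from a KG for the user's question.
facts = [
    ("Python", "created_by", "Guido van Rossum"),
    ("Python", "first_released", "1991"),
]
prompt = (
    "Answer using only the facts below.\n\n"
    + triples_to_context(facts)
    + "\n\nQuestion: Who created Python?"
)
print(prompt)
```

Real KGPLM/KGLLM approaches discussed in the survey go further (knowledge-aware pretraining objectives, fusion layers); this shows only the simplest prompt-injection variant.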
yuexi
voted
Mike
voted
Final decision
What was the agreed final decision?
#3947
-
Yang 2024
Two Heads Are Better Than One: Integrating Knowledge from Knowledge Graphs and Large Language Models for Entity Alignment
arXiv 2024;(): 2024 Ref ID: 8055 Entity alignment, which is a prerequisite for creating a more comprehensive Knowledge Graph (KG), involves pinpointing equivalent entities across disparate KGs. Contemporary methods for entity alignment have predominantly utilized knowledge embedding models to procure entity embeddings that encapsulate various similarities-structural, relational, and attributive. These embeddings are then integrated through attention-based information fusion mechanisms. Despite this progress, effectively harnessing multifaceted information remains challenging due to inherent heterogeneity. Moreover, while Large Language Models (LLMs) have exhibited exceptional performance across diverse downstream tasks by implicitly capturing entity semantics, this implicit knowledge has yet to be exploited for entity alignment. In this study, we propose a Large Language Model-enhanced Entity Alignment framework (LLMEA), integrating structural knowledge from KGs with semantic knowledge from LLMs to enhance entity alignment. Specifically, LLMEA identifies candidate alignments for a given entity by considering both embedding similarities between entities across KGs and edit distances to a virtual equivalent entity. It then engages an LLM iteratively, posing multiple multi-choice questions to draw upon the LLM's inference capability. The final prediction of the equivalent entity is derived from the LLM's output. Experiments conducted on three public datasets reveal that LLMEA surpasses leading baseline models. Additional ablation studies underscore the efficacy of our proposed framework. |
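The candidate-selection step described above combines embedding similarity with edit distance. The edit-distance part alone can be sketched generically; this is an illustrative Levenshtein implementation, not LLMEA's code, and the entity names are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance via a single-row dynamic program."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))          # distances for the previous row
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i       # prev holds dp[i-1][j-1]
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                       # deletion
                        dp[j - 1] + 1,                   # insertion
                        prev + (a[i - 1] != b[j - 1]))   # substitution
            prev = cur
    return dp[n]

# Rank candidate entities from another KG by distance to a query surface form.
candidates = ["Muenchen", "Munich", "Monaco"]
print(sorted(candidates, key=lambda c: edit_distance("Munchen", c)))
```

In the paper's setting, such a distance would only shortlist candidates; the final choice is delegated to the LLM via multi-choice questions.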
Srividya
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1721
-
Yang 2024
PEK: A Parameter-Efficient Framework for Knowledge-Grounded Dialogue Generation
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():9261-9273 Association for Computational Linguistics (ACL) 2024 Ref ID: 4228 Pre-trained language models (PLMs) have shown great dialogue generation capability in different scenarios. However, the huge VRAM consumption when fine-tuning them is one of their drawbacks. PEFT approaches can significantly reduce the number of trainable parameters, which enables us to fine-tune larger dialogue generation models. However, the reduction in parameter quantity can diminish a PLM's expressive capacity and affect the PLM's learning from certain specific examples like knowledge-related conversations. Previous works have demonstrated that injecting external knowledge into dialogue generation models can improve the model's performance in knowledge-related conversations. Nonetheless, these methods are designed for the scenario where most parameters of the entire framework are trainable. In this paper, we propose PEK, a parameter-efficient framework for knowledge-enhanced dialogue generation. It enables PLMs to leverage external knowledge documents and knowledge graphs to enhance its generation capabilities with an acceptable number of trainable parameters. Evaluation results on the Wizard of Wikipedia and CMU_DoG datasets show that our approach outperforms baseline methods on multiple evaluation metrics, which validates the effectiveness of our approach. © 2024 Association for Computational Linguistics. |
Xinchen
voted
Srividya
voted
Final decision
What was the agreed final decision?
#28
-
Yang 2024
Ascle-A Python Natural Language Processing Toolkit for MedicalText Generation:Development and Evaluation Study
Background: Medical texts present significant domain-specific challenges, and manually curating these texts is a time-consuming and labor-intensive process. To address this, natural language processing (NLP) algorithms have been developed to automate text processing. In the biomedical field, various toolkits for text processing exist, which have greatly improved the efficiency of handling unstructured text. However, these existing toolkits tend to emphasize different perspectives, and none of them offer generation capabilities, leaving a significant gap in the current offerings. Objective: This study aims to describe the development and preliminary evaluation of Ascle. Ascle is tailored for biomedical researchers and clinical staff with an easy-to-use, all-in-one solution that requires minimal programming expertise. For the first time, Ascle provides 4 advanced and challenging generative functions: question-answering, text summarization, text simplification, and machine translation. In addition, Ascle integrates 12 essential NLP functions, along with query and search capabilities for clinical databases. Methods: We fine-tuned 32 domain-specific language models and evaluated them thoroughly on 27 established benchmarks. In addition, for the question-answering task, we developed a retrieval-augmented generation (RAG) framework for large language models that incorporated a medical knowledge graph with ranking techniques to enhance the reliability of generated answers. Additionally, we conducted a physician validation to assess the quality of generated content beyond automated metrics. Results: The fine-tuned models and RAG framework consistently enhanced text generation tasks. For example, the fine-tuned models improved the machine translation task by 20.27 in terms of BLEU score. In the question-answering task, the RAG framework raised the ROUGE-L score by 18% over the vanilla models. Physician validation of generated answers showed high scores for readability (4.95/5) and relevancy (4.43/5), with lower scores for accuracy (3.90/5) and completeness (3.31/5). Conclusions: This study introduces the development and evaluation of Ascle, a user-friendly NLP toolkit designed for medical text generation. All code is publicly available through the Ascle GitHub repository. All fine-tuned language models can be accessed through Hugging Face |
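Several records in this batch report ROUGE-L gains, so it is worth recalling how that metric is computed: an F-measure over the longest common subsequence (LCS) of candidate and reference token sequences. The sketch below is a minimal, self-contained sentence-level implementation for illustration only; the names `lcs_len` and `rouge_l_f1` are invented here and are not part of the Ascle toolkit.

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    # ROUGE-L: harmonic mean of LCS-based precision and recall
    # over whitespace-tokenized candidate and reference.
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

For example, a candidate "the cat" against the reference "the cat sat on mat" gives precision 1.0 and recall 0.4, so the F1 is 4/7; production evaluations typically add stemming and multi-reference handling on top of this core.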
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3487
-
Yang 2024
Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education
arXiv 2024;(): 2024 Ref ID: 8461 Knowledge graphs (KGs) are crucial in the field of artificial intelligence and are widely applied in downstream tasks, such as enhancing Question Answering (QA) systems. The construction of KGs typically requires significant effort from domain experts. Recently, Large Language Models (LLMs) have been used for knowledge graph construction (KGC); however, most existing approaches focus on a local perspective, extracting knowledge triplets from individual sentences or documents. In this work, we introduce Graphusion, a zero-shot KGC framework from free text. The core fusion module provides a global view of triplets, incorporating entity merging, conflict resolution, and novel triplet discovery. We showcase how Graphusion can be applied to the natural language processing (NLP) domain and validate it in an educational scenario. Specifically, we introduce TutorQA, a new expert-verified benchmark for graph reasoning and QA, comprising six tasks and a total of 1,200 QA pairs. Our evaluation demonstrates that Graphusion surpasses supervised baselines by up to 10% in accuracy on link prediction. Additionally, it achieves average scores of 2.92 and 2.37 out of 3 in human evaluations for concept entity extraction and relation recognition, respectively. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#766
-
Yang 2024
Sequential Recommendation with Latent Relations based on Large Language Model
47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2024;():335-344 Washington, DC Assoc Computing Machinery 2024 DOI: 10.1145/3626772.3657762 · Ref ID: 3375 Sequential recommender systems predict items that may interest users by modeling their preferences based on historical interactions. Traditional sequential recommendation methods rely on capturing implicit collaborative filtering signals among items. Recent relation-aware sequential recommendation models have achieved promising performance by explicitly incorporating item relations into the modeling of user historical sequences, where most relations are extracted from knowledge graphs. However, existing methods rely on manually predefined relations and suffer from the sparsity issue, limiting their generalization ability in diverse scenarios with varied item relations. In this paper, we propose a novel relation-aware sequential recommendation framework with Latent Relation Discovery (LRD). Different from previous relation-aware models that rely on predefined rules, we propose to leverage the Large Language Model (LLM) to provide new types of relations and connections between items. The motivation is that the LLM contains abundant world knowledge, which can be adopted to mine latent relations of items for recommendation. Specifically, inspired by the fact that humans can describe relations between items using natural language, LRD harnesses the LLM, which has demonstrated human-like knowledge, to obtain language knowledge representations of items. These representations are fed into a latent relation discovery module based on the discrete state variational autoencoder (DVAE). Then the self-supervised relation discovery tasks and recommendation tasks are jointly optimized. Experimental results on multiple public datasets demonstrate that our proposed latent relation discovery method can be incorporated into existing relation-aware sequential recommendation models and significantly improves their performance. Further analysis experiments indicate the effectiveness and reliability of the discovered latent relations. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#22
-
Yang 2022
Approximate inferring with confidence predicting based on uncertain knowledge graph embedding
Uncertainty is a natural character of knowledge, yet it is still difficult to encode into the knowledge graph embedding space that can be employed for machine learning tasks. However, approximate inference can be performed in the embedding space if confidence, the real-valued representation of the uncertainty of knowledge facts, can be learned by neural networks. To tackle this, a simple yet effective confidence predicting method is proposed, and several approximate inferences are efficiently performed based on these predictions. The model is a two-step model: a knowledge elements embedding step, in which knowledge facts regarded as short sentences are fed into the natural language model to get entity and relation embedding vectors; and a confidence learning step, in which the confidence distribution of knowledge facts in the knowledge graph is learned using a recurrent neural network in order to carry out approximate inference. The experiments demonstrate that the model achieves better results than the state of the art on the link prediction task over uncertain knowledge graph embedding. Uncertainty inferring grounded on predicted confidence is more accurate, feasible, and meaningful for several knowledge inferring tasks: transitivity, composition inferring, and probabilistic soft logic inferring. Likewise, the proposed approach achieves the best tradeoff between efficiency and accuracy of uncertain knowledge graph embedding and inferring, and can be used to handle large knowledge graphs at lower time consumption because of its simplicity. (c) 2022 Elsevier Inc. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1723
-
Yang 2020
Person-relation extraction using bert based knowledge graph
ICIC Express Lett Part B Appl. 2020;11(6):539-544 2020 DOI: 10.24507/icicelb.11.06.539 · Ref ID: 5797 Artificial intelligence technology has been actively researched in the areas of image processing and natural language processing. Recently, with the release of Google’s language model BERT, the importance of artificial intelligence models has attracted attention in the field of natural language processing. In this paper, we propose a knowledge graph to build a model that can extract people in a document using BERT, and to grasp the relationship between people based on the model. In addition, to verify the applicability of person extraction techniques using BERT based knowledge graphs, we conduct a performance comparison experiment with other person extraction models and apply our proposed method to the case study. © 2020, ICIC International. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1577
-
Yang 2021
Learning Knowledge Uncertainty from the Pretrained Language Model
ACM International Conference Proceeding Series 2021;():37-42 Association for Computing Machinery 2021 DOI: 10.1145/3503928.3503936 · Ref ID: 5588 Uncertain knowledge graphs, with each fact assigned a confidence value between 0 and 1, are a kind of graph-structured knowledge base. Knowledge representation is the foundation for most knowledge-driven applications, in which knowledge is encoded into a continuous vector space for rapid computation. Unfortunately, it is still a big challenge to encode meaningful knowledge features into the embedding space, such as uncertainty, inferring structures, and commonsense knowledge. An uncertain knowledge graph embedding model, UKGEbert, is proposed to embed latent commonsense semantics via the pretrained natural language model. In the model, each knowledge fact is treated as a short sentence, which is fed into BERT for training. After that, the model learns the uncertainty distribution of knowledge confidence with a recurrent neural network. Experiments on several benchmark datasets show that an effective prediction of confidence can help enhance the ability of knowledge inferring in the embedding space. Furthermore, the model achieves the state of the art in several main metrics on the link prediction task of uncertain knowledge graphs. © 2021 Association for Computing Machinery. All rights reserved. |
Srividya
voted
Mike
voted
Final decision
What was the agreed final decision?
#2653
-
Yang 2010
Mapping Relational Databases into Ontologies through a Graph-based Formal Model
2010 Sixth International Conference on Semantics, Knowledge and Grids 2010;():219-226 2010 DOI: 10.1109/SKG.2010.33 · Ref ID: 6432 One of the key issues of Semantic Web applications is the lack of semantic data (ontologies). Although the vast majority of data is stored in popular relational databases, it is still not easily available to many next-generation Web applications. Therefore, one of the core challenges of the Semantic Web is whether these applications can automatically retrieve semantic information from existing relational databases. This paper proposes an intermediate graph-based formal model language, W-graph, as a bridge between relational databases and ontologies, which abstracts semantic information from relational database instances semi-automatically and then generates an OWL ontology automatically. This method not only maps relational database schemata to ontologies, but also populates the ontologies with data stored in the databases. Moreover, a proof of semantic preservation of the mapping is provided, and a case study and an implemented prototype tool are also reported. |
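The core idea behind such database-to-ontology mappings can be illustrated with a toy direct mapping: each row becomes an ontology instance, and each non-key column becomes a property triple. This is a minimal sketch of the general technique, not the W-graph model from the paper; the function `rows_to_triples` and its naming scheme for subjects and properties are invented for illustration.

```python
def rows_to_triples(table, pk, rows):
    # Naive relational-to-ontology mapping:
    #  - each row becomes an instance typed by its table,
    #  - each non-key column becomes a datatype-property triple.
    triples = []
    for row in rows:
        subject = f"{table}/{row[pk]}"          # instance identifier from the primary key
        triples.append((subject, "rdf:type", table))
        for col, val in row.items():
            if col != pk:
                triples.append((subject, f"{table}#{col}", val))
    return triples
```

Real mappings (e.g. the W3C Direct Mapping) additionally translate foreign keys into object properties between instances, which this sketch omits.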
Mike
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1149
-
Yang 2020
Creative storytelling with language models and knowledge graphs
CEUR Workshop Proceedings 2020;2699(): CEUR-WS 2020 Ref ID: 5746 Automated story generation is a popular and well-recognized task in the field of natural language processing. The emergence of pre-trained language models based on large Transformer architectures shows the great capability of text generation. However, language models are limited when the generation requires explicit clues within the context. In this research, we study how to combine knowledge graphs with language models, and build a creative story generation system named DICE. DICE uses external knowledge graphs to provide context clues and implicit knowledge to generate coherent and creative stories. The evaluation shows that our approach can effectively inject the knowledge from knowledge graphs into the stories automatically generated by the language model. © 2020 CEUR-WS. All rights reserved. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3304
-
Yang 2024
CRAG – Comprehensive RAG Benchmark
arXiv 2024;(): 2024 Ref ID: 8363 Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs) lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamisms ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA. Whereas most advanced LLMs achieve <=34% accuracy on CRAG, adding RAG in a straightforward manner improves the accuracy only to 44%. State-of-the-art industry RAG solutions answer only 63% of questions without any hallucination. CRAG also reveals much lower accuracy in answering questions regarding facts with higher dynamism, lower popularity, or higher complexity, suggesting future research directions. The CRAG benchmark laid the groundwork for a KDD Cup 2024 challenge, attracting thousands of participants and submissions within the first 50 days of the competition. We commit to maintaining CRAG to serve research communities in advancing RAG solutions and general QA solutions. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2006
-
Yang 2024
UniArk: Improving Generalisation and Consistency for Factual Knowledge Extraction through Debiasing
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 2024;1():7011-7028 Association for Computational Linguistics (ACL) 2024 Ref ID: 4423 Several recent papers have investigated the potential of language models as knowledge bases as well as the existence of severe biases when extracting factual knowledge. In this work, we focus on the factual probing performance over unseen prompts from tuning, and using a probabilistic view we show the inherent misalignment between pre-training and downstream tuning objectives in language models for probing knowledge. We hypothesize that simultaneously debiasing these objectives can be the key to generalisation over unseen prompts. We propose an adapter-based framework, UniArk, for generalised and consistent factual knowledge extraction through simple methods without introducing extra parameters. Extensive experiments show that UniArk can significantly improve the model’s out-of-domain generalisation as well as consistency under various prompts. Additionally, we construct ParaTrex, a large-scale and diverse dataset for measuring the inconsistency and out-of-domain generation of models. Further, ParaTrex offers a reference method for constructing paraphrased datasets using large language models. © 2024 Association for Computational Linguistics. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#16
-
Yang 2023
API comparison knowledge extraction via prompt-tuned language model
Application Programming Interfaces (APIs) are frequent in software engineering domain texts, such as API references and Stack Overflow. These APIs and the comparison knowledge between them are not only important for solving programming issues (e.g., question answering), but they are also organized into structured knowledge to support many software engineering tasks (e.g., API misuse detection). As a result, extracting API comparison knowledge (API entities and semantic relations) from texts is essential. Existing rule-based and sequence-labeling-based approaches must manually enumerate all linguistic patterns or label a large amount of data. Therefore, they involve a significant labor overhead and are exacerbated by morphological and common-word ambiguity. In contrast to matching or labeling API entities and relations, we formulate heterogeneous API extraction and API relation extraction tasks as a sequence-to-sequence generation task and propose APICKnow, an API entity-relation joint extraction model based on the large language model. To improve our model's performance and quick learning ability, we adopt the prompt learning method to stimulate APICKnow to recognize API entities and relations. We systematically evaluate APICKnow on a set of sentences from Stack Overflow. The experimental results show that APICKnow outperforms the state-of-the-art baselines and has a quick learning ability and strong generalization ability. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1434
-
Yang 2024
An Intra-Network Multi-Teacher Distillation Method Towards Lightweight Knowledge Graph Completion
2024 IEEE 9th International Conference on Computational Intelligence and Applications, ICCIA 2024 2024;():109-114 Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/ICCIA62557.2024.10719142 · Ref ID: 4182 Recently, Knowledge Graph Completion (KGC) based on Pre-trained Language Models (PLMs) has made significant advancements. However, PLMs typically have a large number of parameters, which makes lightweight research for low-resource settings challenging. For KGC, knowledge distillation can be a portable method, but it is difficult for traditional knowledge distillation to achieve efficient knowledge transfer. To solve this issue, this paper proposes an intra-network multi-teacher knowledge distillation, which can effectively reduce knowledge leakage through multi-level information transmission. Specifically, we divide the teacher model into multiple sub-teachers based on network depth, and the sub-teachers deliver different knowledge representations. In addition, we use the loss variation of each sub-teacher as a confidence level, which can dynamically regulate the intensity of multi-teacher distillation and enable the student model to perceive distilled knowledge at a finer granularity. A series of experimental results shows that our proposed method achieves state-of-the-art performance with a low number of parameters. © 2024 IEEE. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1146
-
Yang 2023
Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():5186-5219 Association for Computational Linguistics (ACL) 2023 Ref ID: 5272 While large language models (LLMs), such as GPT-3, appear to be robust and general, their reasoning ability is not at a level to compete with the best models trained for specific natural language reasoning problems. In this study, we observe that a large language model can serve as a highly effective few-shot semantic parser. It can convert natural language sentences into a logical form that serves as input for answer set programs, a logic-based declarative knowledge representation formalism. The combination results in a robust and general system that can handle multiple question-answering tasks without requiring retraining for each new task. It only needs a few examples to guide the LLM's adaptation to a specific task, along with reusable ASP knowledge modules that can be applied to multiple tasks. We demonstrate that this method achieves state-of-the-art performance on several NLP benchmarks, including bAbI, StepGame, CLUTRR, and gSCAN. Additionally, it successfully tackles robot planning tasks that an LLM alone fails to solve. © 2023 Association for Computational Linguistics. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2376
-
Yang 2023
EMoDi: Entity-Enhanced Momentum-Difference Contrastive Learning for Semantic-Aware Verification of Scientific Information
2023 IEEE International Conference on Knowledge Graph (ICKG) 2023;():142-151 2023 DOI: 10.1109/ICKG59574.2023.00023 · Ref ID: 6314 This paper proposes the EMoDi system to improve the performance of the entire scientific information verification pipeline. First, the Momentum-Difference contrastive learning framework is introduced to capture more semantics information. In abstract retrieval, entity-enhancement and noise-ignoration are introduced to improve the ability to retrieve relevant abstracts more accurately. In addition, a two-step verification method is used in label prediction to improve the label prediction ability and reduce the false positive rate of the “NOT ENOUGH INFO” label. The proposed pipeline outperforms the baseline VERISCI and QMUL-SDS. The code of this system is available on GitHub. |
Davis
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3310
-
Yang 2024
CuriousLLM: Elevating Multi-Document QA with Reasoning-Infused Knowledge Graph Prompting
arXiv 2024;(): 2024 Ref ID: 8234 In the field of Question Answering (QA), unifying large language models (LLMs) with external databases has shown great success. However, these methods often fall short in providing the advanced reasoning needed for complex QA tasks. To address these issues, we improve over a novel approach called Knowledge Graph Prompting (KGP), which combines knowledge graphs with a LLM-based agent to improve reasoning and search accuracy. Nevertheless, the original KGP framework necessitates costly fine-tuning with large datasets yet still suffers from LLM hallucination. Therefore, we propose a reasoning-infused LLM agent to enhance this framework. This agent mimics human curiosity to ask follow-up questions to more efficiently navigate the search. This simple modification significantly boosts the LLM performance in QA tasks without the high costs and latency associated with the initial KGP framework. Our ultimate goal is to further develop this approach, leading to more accurate, faster, and cost-effective solutions in the QA domain. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3568
-
Yao 2019
KG-BERT: BERT for Knowledge Graph Completion
arXiv 2019;(): 2019 Ref ID: 7374 Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes the entity and relation descriptions of a triple as input and computes the scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction, and relation prediction tasks. |
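The central trick in KG-BERT is to serialize a knowledge graph triple into a BERT-style textual sequence before scoring it. The helper below sketches only that input-construction step, with the usual special tokens; the function name `triple_to_sequence` is hypothetical, and the paper's actual pipeline then feeds this sequence through BERT with a classification head to score the triple's plausibility.

```python
def triple_to_sequence(head_desc, relation, tail_desc, cls="[CLS]", sep="[SEP]"):
    # KG-BERT-style serialization: the head description, relation, and tail
    # description become one sequence separated by [SEP], prefixed by [CLS]
    # whose final hidden state is used for triple classification.
    return f"{cls} {head_desc} {sep} {relation} {sep} {tail_desc} {sep}"
```

A call such as `triple_to_sequence("Steve Jobs", "founded", "Apple Inc.")` yields `[CLS] Steve Jobs [SEP] founded [SEP] Apple Inc. [SEP]`, which a standard BERT tokenizer can consume directly.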
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#11
-
Yao 2024
AgCNER, the First Large-Scale Chinese Named Entity Recognition Dataset for Agricultural Diseases and Pests
Named entity recognition is a fundamental subtask for knowledge graph construction and question answering in the agricultural diseases and pests field. Although several works have been done, the scarcity of Chinese annotated datasets has restricted the development of agricultural diseases and pests named entity recognition (ADP-NER). To address this issue, a large-scale corpus for the Chinese ADP-NER task named AgCNER was first annotated. It mainly contains 13 categories, 206,992 entities, and 66,553 samples with 3,909,293 characters. Compared with other datasets, AgCNER maintains the best performance in terms of the number of categories, entities, samples, and characters. Moreover, this is the first publicly available corpus for the agricultural field. In addition, the agricultural language model AgBERT is also fine-tuned and released. Finally, comprehensive experimental results showed that BiLSTM-CRF achieved an F1-score of 93.58%, which was further improved to 94.14% using BERT. Analysis from multiple aspects has verified the rationality of AgCNER and the effectiveness of AgBERT. The annotated corpus and fine-tuned language model are publicly available at https://doi.org/XXX and https://github.com/guojson/AgCNER.git. |
Mike
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2593
-
Yao 2024
Internal-External Information Enhanced Causal Reasoning
2024 International Joint Conference on Neural Networks (IJCNN) 2024;():1-8 2024 DOI: 10.1109/IJCNN60899.2024.10651415 · Ref ID: 6038 Causal reasoning is vitally important for various natural language processing tasks, which need text semantic understanding and a rich reserve of knowledge information. Causal question-answering (CQA), one of the causal reasoning tasks, aims to choose either the cause or the effect of a given story sentence. It requires both background causal knowledge and the ability to infer cause-effect relations. However, existing studies ignore the logical and commonsense relationship between the contexts, which limits model capability. In this paper, we propose a novel model of Semantic Internal-External Enhancement (SIEE) that enhances both internal and external knowledge. The model employs Abstract Meaning Representation (AMR) to capture the core semantic information and explicit structures. In addition, we explore the commonsense knowledge behind the key information in the context to provide more clues for reasoning. Finally, we combine the above internal and external information by using a semantic aggregator to aggregate the semantic information of neighbors on the keyword nodes. Experimental studies show the competitive performance of our proposed model over the state-of-the-art published results on three CQA benchmarks: e-CARE, COPA, and BCOPA. |
Srividya
voted
Davis
voted
Final decision
What was the agreed final decision?
#742
-
Yao 2023
Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction
46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR) 2023;():911-921 Taipei, TAIWAN Assoc Computing Machinery 2023 DOI: 10.1145/3539618.3591763 · Ref ID: 2952 With the development of pre-trained language models, many prompt-based approaches to data-efficient knowledge graph construction have achieved impressive performance. However, existing prompt-based learning methods for knowledge graph construction are still susceptible to several potential limitations: (i) the semantic gap between natural language and output structured knowledge with a pre-defined schema, which means the model cannot fully exploit semantic knowledge with the constrained templates; (ii) representation learning with locally individual instances limits the performance, given insufficient features that are unable to unleash the potential analogical capability of pre-trained language models. Motivated by these observations, we propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP), for data-efficient knowledge graph construction. It can dynamically leverage schema and knowledge inherited from human-annotated and weakly supervised data as a prompt for each sample, which is model-agnostic and can be plugged into widespread existing approaches. Experimental results demonstrate that previous methods integrated with RAP can achieve impressive performance gains in low-resource settings on five datasets of relational triple extraction and event extraction for knowledge graph construction. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2066
-
Yao 2022
Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context
AMIA Annu Symp Proc 2022;2022():1188-1197 2022 Ref ID: 5866 Language Models (LMs) have performed well on biomedical natural language processing applications. In this study, we conducted experiments using prompt methods to extract knowledge from LMs as new knowledge bases (LMs as KBs). However, prompting can only serve as a lower bound for knowledge extraction, and it performs particularly poorly on biomedical-domain KBs. In order to make LMs as KBs more in line with the actual application scenarios of the biomedical domain, we specifically add EHR notes as context to the prompt to improve this lower bound in the biomedical domain. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge possessed by those language models can distinguish the correct knowledge from the noise knowledge in the EHR notes, and such distinguishing ability can also be used as a new metric to evaluate the amount of knowledge possessed by the model. |
Davis
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3861
-
Yao 2024
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation
arXiv 2024;(): 2024 Ref ID: 8430 This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states. SeaKR activates retrieval when the LLMs present high self-aware uncertainty for generation. To effectively integrate retrieved knowledge snippets, SeaKR re-ranks them based on LLM's self-aware uncertainty to preserve the snippet that reduces their uncertainty to the utmost. To facilitate solving complex tasks that require multiple retrievals, SeaKR utilizes their self-aware uncertainty to choose among different reasoning strategies. Our experiments on both complex and simple Question Answering datasets show that SeaKR outperforms existing adaptive RAG methods. We release our code at https://github.com/THU-KEG/SeaKR. |
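The retrieval-gating idea this abstract describes, retrieving only when the model's self-aware uncertainty is high and then keeping the snippet that reduces uncertainty the most, can be sketched abstractly. Everything below (`generate`, `uncertainty`, `retrieve`, the threshold value) is a hypothetical interface written for illustration, not SeaKR's actual API, which reads uncertainty from the LLM's internal states.

```python
def answer_with_adaptive_retrieval(question, generate, uncertainty, retrieve, threshold=0.5):
    # Step 1: draft an answer without any retrieved context.
    draft = generate(question, context=None)
    # Step 2: retrieve only when the model's self-aware uncertainty is high.
    if uncertainty(draft) <= threshold:
        return draft
    # Step 3: re-rank the retrieved snippets, keeping the one whose
    # inclusion reduces the model's uncertainty the most.
    snippets = retrieve(question)
    best = min(snippets, key=lambda s: uncertainty(generate(question, context=s)))
    return generate(question, context=best)
```

The same uncertainty signal can also arbitrate between reasoning strategies for multi-hop questions, which this single-pass sketch leaves out.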
Srividya
voted
yuexi
voted
Final decision
What was the agreed final decision?
#159
-
Yasunaga 2022
Deep Bidirectional Language-Knowledge Graph Pretraining
36th Conference on Neural Information Processing Systems (NeurIPS) 2022;(): Electr Network Neural Information Processing Systems (Nips) 2022 Ref ID: 3034 Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised method to pretrain a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction. DRAGON outperforms existing LM and LM+KG models on diverse downstream tasks including question answering across general and biomedical domains, with +5% absolute gain on average. In particular, DRAGON achieves strong performance on complex reasoning about language and knowledge (+10% on questions involving long contexts or multi-step reasoning) and low-resource QA (+8% on OBQA and RiddleSense), and new state-of-the-art results on various BioNLP tasks. Our code and trained models are available at https://github.com/michiyasunaga/dragon. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3221
-
Ye 2023
Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction
arXiv 2023;(): 2023 Ref ID: 7972 Knowledge graph construction (KGC) is a multifaceted undertaking involving the extraction of entities, relations, and events. Traditionally, large language models (LLMs) have been viewed as solitary task-solving agents in this complex landscape. However, this paper challenges this paradigm by introducing a novel framework, CooperKGC. Departing from the conventional approach, CooperKGC establishes a collaborative processing network, assembling a KGC collaboration team capable of concurrently addressing entity, relation, and event extraction tasks. Our experiments unequivocally demonstrate that fostering collaboration and information interaction among diverse agents within CooperKGC yields superior results compared to individual cognitive processes operating in isolation. Importantly, our findings reveal that the collaboration facilitated by CooperKGC enhances knowledge selection, correction, and aggregation capabilities across multiple rounds of interactions. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#3794
-
Ye 2023
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model
arXiv 2023;(): 2023 Ref ID: 7894 Integrating large language models (LLMs) into healthcare holds great potential but faces challenges. Pre-training LLMs from scratch for domains like medicine is resource-heavy and often unfeasible. On the other hand, sole reliance on Supervised Fine-tuning (SFT) can result in overconfident predictions and may not tap into domain-specific insights. In response, we present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), SFT, and Direct Preference Optimization (DPO). In addition, we publish a 3Gb Chinese Medicine (ChiMed) dataset, encompassing medical question answering, plain texts, knowledge graphs, and dialogues, segmented into three training stages. The medical LLM trained with our pipeline, Qilin-Med, shows substantial performance improvement. In the CPT and SFT phases, Qilin-Med achieved 38.4% and 40.0% accuracy on the CMExam test set, respectively. It outperformed the base model Baichuan-7B (accuracy: 33.5%) by 7.5%. In the DPO phase, it scored 16.66 in BLEU-1 and 27.44 in ROUGE-1 on the Huatuo-26M test set, bringing further improvement over the SFT phase (12.69 in BLEU-1 and 24.21 in ROUGE-1). Additionally, we have further enhanced the model's performance through the Retrieval Augmented Generation (RAG) approach. Experiments demonstrate that Qilin-Med-RAG achieves an accuracy rate of 42.8% on CMExam. These results highlight the contribution of our novel training approach in building LLMs for medical applications. |
Votes: brandon, Kwesi · Final decision: not recorded
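The BLEU-1 and ROUGE-1 figures cited above are unigram-overlap scores. A minimal sketch of ROUGE-1 as an F-measure over clipped unigram counts; whitespace tokenization is a simplifying assumption (real tooling normalizes and stems):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    # ROUGE-1: unigram overlap between candidate and reference, as an F-measure.
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```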
#2285
-
Ye 2024
Correcting Factual Errors in LLMs via Inference Paths Based on Knowledge Graph
2024 International Conference on Computational Linguistics and Natural Language Processing (CLNLP) 2024;():12-16 2024 DOI: 10.1109/CLNLP64123.2024.00011 · Ref ID: 7013 Large language models (LLMs) have been observed to occasionally exhibit hallucination, a phenomenon where they generate statements unsupported by factual evidence, thereby compromising the trustworthiness of their output. Current approaches to mitigating this problem largely rely on extracting a single triplet from a knowledge graph, which fails to adequately capture the complex and interlinked nature of factual reasoning. In an effort to address this critical challenge, this paper delves into the utilization of inference paths based on a knowledge graph for factual error correction of LLMs. At the heart of our approach lies the deployment of deep reinforcement learning algorithms, which traverse the knowledge graph to retrieve inference paths. These paths, replete with contextual depth and logical coherence, ground the model's generation, thereby amending the content and diminishing the incidence of factual discrepancies in the reasoning process of LLMs. Experimental results demonstrate that our approach markedly enhances the factual QA performance of LLMs. Furthermore, it shows great potential in improving the reliability of LLMs in complex reasoning scenarios, highlighting the effectiveness of inference paths derived from the knowledge graph. |
Votes: yuexi, Davis · Final decision: not recorded
#3283
-
Ye 2024
Construction and Application of Materials Knowledge Graph in Multidisciplinary Materials Science via Large Language Model
arXiv 2024;(): 2024 Ref ID: 8216 Knowledge in materials science is widely dispersed across extensive scientific literature, posing significant challenges for efficient discovery and integration of new materials. Traditional methods, often reliant on costly and time-consuming experimental approaches, further complicate rapid innovation. Addressing these challenges, the integration of artificial intelligence with materials science has opened avenues for accelerating the discovery process, though it also demands precise annotation, data extraction, and traceability of information. To tackle these issues, this article introduces the Materials Knowledge Graph (MKG), which utilizes advanced natural language processing techniques, integrated with large language models, to extract and systematically organize a decade's worth of high-quality research into structured triples; the resulting graph contains 162,605 nodes and 731,772 edges. MKG categorizes information into comprehensive labels such as Name, Formula, and Application, structured around a meticulously designed ontology, thus enhancing data usability and integration. By implementing network-based algorithms, MKG not only facilitates efficient link prediction but also significantly reduces reliance on traditional experimental methods. This structured approach not only streamlines materials research but also lays the groundwork for more sophisticated science knowledge graphs. |
Votes: brandon, Kwesi · Final decision: not recorded
#981
-
Yin 2023
ALCUNA: Large Language Models Meet New Knowledge
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():1397-1414 Association for Computational Linguistics (ACL) 2023 DOI: 10.18653/v1/2023.emnlp-main.87 · Ref ID: 5081 With the rapid development of NLP, large-scale language models (LLMs) now excel in various tasks across multiple domains. However, existing benchmarks may not adequately measure these models' capabilities, especially when faced with new knowledge. In this paper, we address the lack of benchmarks to evaluate LLMs' ability to handle new knowledge, an important and challenging aspect in the rapidly evolving world. We propose an approach called KnowGen that generates new knowledge by altering existing entity attributes and relationships, resulting in artificial entities that are distinct from real-world entities. With KnowGen, we introduce a benchmark named ALCUNA to assess LLMs' abilities in knowledge understanding, differentiation, and association. We benchmark several LLMs, revealing that their performance in the face of new knowledge is not satisfactory, particularly in reasoning between new and internal knowledge. We also explore the impact of entity similarity on the model's understanding of entity knowledge and the influence of contextual entities. We appeal to the need for caution when using LLMs in new scenarios or with new knowledge, and hope that our benchmarks can help drive the development of LLMs in the face of new knowledge. ©2023 Association for Computational Linguistics. |
Votes: Mike, Srividya · Final decision: not recorded
#1017
-
Yin 2024
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():2270-2286 Association for Computational Linguistics (ACL) 2024 Ref ID: 4306 In recent years, substantial advancements have been made in the development of large language models, achieving remarkable performance across diverse tasks. To evaluate the knowledge ability of language models, previous studies have proposed many benchmarks based on question-answering pairs. We argue that it is not reliable and comprehensive to evaluate language models with a fixed question or limited paraphrases as the query, since language models are sensitive to prompts. Therefore, we introduce a novel concept named knowledge boundary to encompass both prompt-agnostic and prompt-sensitive knowledge within language models. Knowledge boundary avoids prompt sensitivity in language model evaluations, rendering them more dependable and robust. To explore the knowledge boundary for a given model, we propose a projected gradient descent method with semantic constraints, a new algorithm designed to identify the optimal prompt for each piece of knowledge. Experiments demonstrate a superior performance of our algorithm in computing the knowledge boundary compared to existing methods. Furthermore, we evaluate the ability of multiple language models in several domains with knowledge boundary. © 2024 Association for Computational Linguistics. |
Votes: Xinchen, Ishan · Final decision: not recorded
#1479
-
Youn 2023
KGLM: Integrating Knowledge Graph Structure in Language Models for Link Prediction
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;():217-224 Association for Computational Linguistics (ACL) 2023 DOI: 10.18653/v1/2023.starsem-1.20 · Ref ID: 5114 The ability of knowledge graphs to represent complex relationships at scale has led to their adoption for various needs including knowledge representation, question-answering, and recommendation systems. Knowledge graphs are often incomplete in the information they represent, necessitating the need for knowledge graph completion tasks. Pre-trained and finetuned language models have shown promise in these tasks although these models ignore the intrinsic information encoded in the knowledge graph, namely the entity and relation types. In this work, we propose the Knowledge Graph Language Model (KGLM) architecture, where we introduce a new entity/relation embedding layer that learns to differentiate distinctive entity and relation types, therefore allowing the model to learn the structure of the knowledge graph. In this work, we show that further pretraining the language models with this additional embedding layer using the triples extracted from the knowledge graph, followed by the standard fine-tuning phase sets a new state-of-the-art performance for the link prediction task on the benchmark datasets. © 2023 Association for Computational Linguistics. |
Votes: Srividya, Xinchen · Final decision: not recorded
#3375
-
Youssef 2024
Enhancing Fact Retrieval in PLMs through Truthfulness
arXiv 2024;(): 2024 Ref ID: 8722 Pre-trained Language Models (PLMs) encode various facts about the world during their pre-training phase, as they are trained to predict the next or missing word in a sentence. There has been interest in quantifying and improving the amount of facts that can be extracted from PLMs, as they have been envisioned to act as soft knowledge bases, which can be queried in natural language. Different approaches exist to enhance fact retrieval from PLMs. Recent work shows that the hidden states of PLMs can be leveraged to determine the truthfulness of the PLMs' inputs. Leveraging this finding to improve factual knowledge retrieval remains unexplored. In this work, we investigate the use of a helper model to improve fact retrieval. The helper model assesses the truthfulness of an input based on the corresponding hidden-state representations from the PLMs. We evaluate this approach on several masked PLMs and show that it enhances fact retrieval by up to 33%. Our findings highlight the potential of hidden-state representations from PLMs in improving their factual knowledge retrieval. |
Votes: yuexi, Davis · Final decision: not recorded
#1251
-
Yu 2024
Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():11019-11029 Association for Computational Linguistics (ACL) 2024 Ref ID: 4241 In this paper, we tackle the task of distractor generation (DG) for multiple-choice questions. Our study introduces two key designs. First, we propose retrieval augmented pretraining, which involves refining the language model pretraining to align it more closely with the downstream task of DG. Second, we explore the integration of knowledge graphs to enhance the performance of DG. Through experiments with benchmarking datasets, we show that our models significantly outperform the state-of-the-art results. Our best-performing model advances the F1@3 score from 14.80 to 16.47 on the MCQ dataset and from 15.92 to 16.50 on the Sciq dataset. © 2024 Association for Computational Linguistics. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#2585
-
Yu 2007
Intelligent Software Agent Design Tool Using Goal Net Methodology
2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT'07) 2007;():43-46 2007 DOI: 10.1109/IAT.2007.25 · Ref ID: 7077 Intelligent agents are a fast-emerging technology with a wide range of applications. Although there are several tools for agent development, few design tools assist the conversion from paper-based agent mental-state designs to effective representations in abstract data structures that can be used by the agent management system to create intelligent software agents. This paper proposes the goal net designer, an integrated development environment (IDE) for modeling agent behavior based on the goal net model, a goal-oriented methodology. It provides a way for users to simplify the various stages of the design process and automatically generate design data that can be used by the multi-agent development environment (MADE) to automatically create intelligent agents. The system reduces the level of skill required for developing agent-augmented applications to such an extent that users with little knowledge of intelligent software agent technology can easily add intelligent agents to their applications, saving the time and cost involved in the development process. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#3378
-
Yu 2024
Enhancing Healthcare through Large Language Models: A Study on Medical Question Answering
arXiv 2024;(): 2024 Ref ID: 8518 In recent years, the application of Large Language Models (LLMs) in healthcare has shown significant promise in improving the accessibility and dissemination of medical knowledge. This paper presents a detailed study of various LLMs trained on the MedQuAD medical question-answering dataset, with a focus on identifying the most effective model for providing accurate medical information. Among the models tested, the Sentence-t5 combined with Mistral 7B demonstrated superior performance, achieving a precision score of 0.762. This model's enhanced capabilities are attributed to its advanced pretraining techniques, robust architecture, and effective prompt construction methodologies. By leveraging these strengths, the Sentence-t5 + Mistral 7B model excels in understanding and generating precise medical answers. Our findings highlight the potential of integrating sophisticated LLMs in medical contexts to facilitate efficient and accurate medical knowledge retrieval, thus significantly enhancing patient education and support. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#2212
-
Yu 2010
Building a context world for dynamic service composition
5th International Conference on Pervasive Computing and Applications 2010;():336-341 2010 DOI: 10.1109/ICPCA.2010.5704123 · Ref ID: 6851 Dynamic service composition requires responding and adapting to changes in the computing environment when orchestrating existing services into one or more new services that fit better to a composite application. This paper abstracts the changes of the environment as a context world to store the physical contexts of the computing environment, user profiles and computed results of services as well. We use ontology techniques to model the domain concepts of application contexts. Context Condition/Effect Description Language is designed to describe the dynamic semantics of the requirements and capabilities of goals and services in a concise and editable manner. Goal-driven and planning techniques are used to dynamically implement the service composition according to the domain knowledge and facts in the context world. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#57
-
Yu 2023
BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM
21st International Conference on Service-Oriented Computing (ICSOC) 2023;14419():339-346 Rome, ITALY Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-48421-6_23 · Ref ID: 3259 Knowledge graph (KG), as a novel knowledge storage approach, has been widely used in various domains. In the service computing community, researchers have tried to harness the enormous potential of KGs to tackle domain-specific tasks. However, the lack of an openly available service domain KG limits the in-depth exploration of KGs in domain-specific applications. Building a service domain KG primarily faces two challenges: first, the diversity and complexity of service domain knowledge, and second, the dispersion of domain knowledge and the lack of annotated data. These challenges discouraged costly investment in large, high-quality domain-specific KGs by researchers. In this paper, we present the construction of a service domain KG called BEAR. We design a comprehensive service domain knowledge ontology to automatically generate the prompts for the Large Language Model (LLM) and employ the LLM to implement a zero-shot method to extract high-quality knowledge. A series of experiments are conducted to demonstrate the feasibility of the graph construction process and showcase the richness of content available from BEAR. Currently, BEAR includes 133,906 nodes, 169,159 relations, and about 424,000 pieces of factual knowledge as attributes; BEAR is available through github.com/HTXone/BEAR. |
Votes: yuexi, mohammed afaan · Final decision: not recorded
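BEAR's ontology-driven prompting can be illustrated roughly as follows. The mini-ontology, function name, and prompt wording are hypothetical, not taken from the paper; the point is only that each ontology entry mechanically yields a zero-shot extraction prompt:

```python
# Hypothetical mini-ontology: entity types mapped to the attributes the KG should hold.
ONTOLOGY = {
    "Service": ["name", "provider", "category"],
}

def build_prompt(entity_type, text):
    # Turn one ontology entry into a zero-shot extraction prompt for the LLM.
    attrs = ", ".join(ONTOLOGY[entity_type])
    return (
        f"Extract every {entity_type} entity from the text below and report "
        f"its attributes ({attrs}) as JSON.\n\nText: {text}"
    )
```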
#3111
-
Yu 2023
BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM
Service-Oriented Computing: 21st International Conference, ICSOC 2023, Rome, Italy, November 28 – December 1, 2023, Proceedings, Part I 2023;():339–346 Rome, Italy Springer-Verlag 2023 DOI: 10.1007/978-3-031-48421-6_23 · Ref ID: 7110 |
Votes: Mike, Davis · Final decision: not recorded
#743
-
Yu 2023
The Second Workshop on Knowledge-Augmented Methods for Natural Language Processing
29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2023;():5899-5900 Long Beach, CA Assoc Computing Machinery 2023 DOI: 10.1145/3580305.3599233 · Ref ID: 3238 Language models are being developed and deployed in many applications, "small"-scale and large-scale, generic and specialized, text-only and multimodal, etc. Meanwhile, the missingness of important knowledge causes limitations and safety challenges. The knowledge includes commonsense, world facts, domain expertise, personalization, and especially the unique patterns that need to be discovered from big data applications. Training and inference processes of language models can and should be augmented with this knowledge. The first KnowledgeNLP workshop at AAAI 2023 attracted scientists working on knowledge augmentation methods toward higher language intelligence. This workshop offers a broad platform to share ideas and discuss various topics, such as (1) synergy between knowledge and language models, (2) scalable architectures that integrate NLP, knowledge graph, and graph learning technologies, (3) KnowledgeNLP for e-commerce, education, and healthcare, and (4) human factors and social good in KnowledgeNLP. |
Votes: Ishan, Srividya · Final decision: not recorded
#3725
-
Yu 2023
A Multimodal Ecological Civilization Pattern Recommendation Method Based on Large Language Models and Knowledge Graph
arXiv 2023;(): 2023 Ref ID: 7912 The Ecological Civilization Pattern Recommendation System (ECPRS) aims to recommend suitable ecological civilization patterns for target regions, promoting sustainable development and reducing regional disparities. However, the current representative recommendation methods are not suitable for recommending ecological civilization patterns in a geographical context. There are two reasons for this. Firstly, regions have spatial heterogeneity, and the ECPRS needs to consider factors like climate, topography, vegetation, etc., to recommend civilization patterns adapted to specific ecological environments, ensuring the feasibility and practicality of the recommendations. Secondly, the abstract features of the ecological civilization patterns in the real world have not been fully utilized, resulting in poor richness in their embedding representations and, consequently, lower performance of the recommendation system. Considering these limitations, we propose the ECPR-MML method. Initially, based on the novel method UGPIG, we construct a knowledge graph to extract regional representations incorporating spatial heterogeneity features. Following that, inspired by the significant progress made by Large Language Models (LLMs) in the field of Natural Language Processing (NLP), we employ LLMs to generate multimodal features for ecological civilization patterns in the form of text and images. We extract and integrate these multimodal features to obtain semantically rich representations of ecological civilization. Through extensive experiments, we validate the performance of our ECPR-MML model. Our results show that its F1@5 is 2.11% higher than state-of-the-art models, 2.02% higher than NGCF, and 1.16% higher than UGPIG. Furthermore, multimodal data can indeed enhance recommendation performance. However, the data generated by LLMs is not as effective as real data to a certain extent. |
Votes: Mike, mohammed afaan · Final decision: not recorded
#3994
-
Yuan 2024
Whispers that Shake Foundations: Analyzing and Mitigating False Premise Hallucinations in Large Language Models
arXiv 2024;(): 2024 Ref ID: 8147 Large Language Models (LLMs) have shown impressive capabilities but still suffer from the issue of hallucinations. A significant type of this issue is the false premise hallucination, which we define as the phenomenon when LLMs generate hallucinated text when confronted with false premise questions. In this paper, we perform a comprehensive analysis of the false premise hallucination and elucidate its internal working mechanism: a small subset of attention heads (which we designate as false premise heads) disturb the knowledge extraction process, leading to the occurrence of false premise hallucination. Based on our analysis, we propose FAITH (False premise Attention head constraIning for miTigating Hallucinations), a novel and effective method to mitigate false premise hallucinations. It constrains the false premise attention heads during the model inference process. Impressively, extensive experiments demonstrate that constraining only approximately 1% of the attention heads in the model yields a notable increase of nearly 20% of model performance. |
Votes: Xinchen, Davis · Final decision: not recorded
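The paper's core intervention, constraining a small set of attention heads at inference time, can be caricatured in a few lines. This pure-Python sketch simply zeroes the outputs of flagged heads; real implementations operate on tensors inside the transformer forward pass, and the head-selection step is the paper's actual contribution:

```python
def constrain_heads(head_outputs, false_premise_heads):
    # head_outputs: one list of floats per attention head.
    # Zero the contribution of flagged heads; leave all other heads untouched.
    return [
        [0.0] * len(out) if i in false_premise_heads else out
        for i, out in enumerate(head_outputs)
    ]
```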
#3167
-
Yuan 2024
VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph
The Semantic Web: 21st International Conference, ESWC 2024, Hersonissos, Crete, Greece, May 26–30, 2024, Proceedings, Part II 2024;():75–93 Hersonissos, Greece Springer-Verlag 2024 DOI: 10.1007/978-3-031-60635-9_5 · Ref ID: 7139 |
Votes: mohammed afaan, Ishan · Final decision: not recorded
#3635
-
Yuan 2023
Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review
arXiv 2023;(): 2023 Ref ID: 7922 With the rapid development of artificial intelligence, large language models (LLMs) have shown promising capabilities in mimicking human-level language comprehension and reasoning. This has sparked significant interest in applying LLMs to enhance various aspects of healthcare, ranging from medical education to clinical decision support. However, medicine involves multifaceted data modalities and nuanced reasoning skills, presenting challenges for integrating LLMs. This paper provides a comprehensive review on the applications and implications of LLMs in medicine. It begins by examining the fundamental applications of general-purpose and specialized LLMs, demonstrating their utilities in knowledge retrieval, research support, clinical workflow automation, and diagnostic assistance. Recognizing the inherent multimodality of medicine, the review then focuses on multimodal LLMs, investigating their ability to process diverse data types like medical imaging and EHRs to augment diagnostic accuracy. To address LLMs' limitations regarding personalization and complex clinical reasoning, the paper explores the emerging development of LLM-powered autonomous agents for healthcare. Furthermore, it summarizes the evaluation methodologies for assessing LLMs' reliability and safety in medical contexts. Overall, this review offers an extensive analysis on the transformative potential of LLMs in modern medicine. It also highlights the pivotal need for continuous optimizations and ethical oversight before these models can be effectively integrated into clinical practice. Visit https://github.com/mingze-yuan/Awesome-LLM-Healthcare for an accompanying GitHub repository containing latest papers. |
Votes: mohammed afaan, Ishan · Final decision: not recorded
#985
-
Yuan 2024
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():1249-1265 Association for Computational Linguistics (ACL) 2024 Ref ID: 4385 Analogical reasoning is a fundamental cognitive ability of humans. However, current language models (LMs) still struggle to achieve human-like performance in analogical reasoning tasks due to a lack of resources for model training. In this work, we address this gap by proposing ANALOGYKB, a million-scale analogy knowledge base (KB) derived from existing knowledge graphs (KGs). ANALOGYKB identifies two types of analogies from the KGs: 1) analogies of the same relations, which can be directly extracted from the KGs, and 2) analogies of analogous relations, which are identified with a selection and filtering pipeline enabled by large language models (LLMs), followed by minor human efforts for data quality control. Evaluations on a series of datasets of two analogical reasoning tasks (analogy recognition and generation) demonstrate that ANALOGYKB successfully enables both smaller LMs and LLMs to gain better analogical reasoning capabilities. Resources of this paper can be found at https://github.com/siyuyuan/analogykb. © 2024 Association for Computational Linguistics. |
Votes: Srividya, Xinchen · Final decision: not recorded
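The first of ANALOGYKB's two analogy types, analogies of the same relation, can be read directly off KG triples. A minimal sketch (the helper name is made up; the paper's second type, analogous relations, additionally needs LLM-based filtering):

```python
from itertools import combinations

def same_relation_analogies(triples):
    # Group (head, tail) pairs by relation, then pair them up:
    # (A, r, B) and (C, r, D) yield the analogy "A is to B as C is to D".
    by_rel = {}
    for head, rel, tail in triples:
        by_rel.setdefault(rel, []).append((head, tail))
    return [
        (a, b, c, d)
        for pairs in by_rel.values()
        for (a, b), (c, d) in combinations(pairs, 2)
    ]
```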
#91
-
Yuan 2023
Causality-aware Concept Extraction based on Knowledge-guided Prompting
61st Annual Meeting of the Association for Computational Linguistics (ACL) 2023;():9255-9272 Toronto, CANADA Assoc Computational Linguistics-Acl 2023 Ref ID: 3394 Concepts benefit natural language understanding but are far from complete in existing knowledge graphs (KGs). Recently, pre-trained language models (PLMs) have been widely used in text-based concept extraction (CE). However, PLMs tend to mine co-occurrence associations from massive corpora as pre-trained knowledge rather than the real causal effect between tokens. As a result, the pre-trained knowledge confounds PLMs to extract biased concepts based on spurious co-occurrence correlations, inevitably resulting in low precision. In this paper, through the lens of a Structural Causal Model (SCM), we propose equipping the PLM-based extractor with a knowledge-guided prompt as an intervention to alleviate concept bias. The prompt adopts the topic of the given entity from the existing knowledge in KGs to mitigate the spurious co-occurrence correlations between entities and biased concepts. Our extensive experiments on representative multilingual KG datasets justify that our proposed prompt can effectively alleviate concept bias and improve the performance of PLM-based CE models. The code has been released at https://github.com/siyuyuan/KPCE. |
Votes: Davis, Srividya · Final decision: not recorded
#3713
-
Yun 2024
MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation
arXiv 2024;(): 2024 Ref ID: 8428 In the context of escalating safety concerns across various domains, the tasks of Video Anomaly Detection (VAD) and Video Anomaly Recognition (VAR) have emerged as critically important for applications in intelligent surveillance, evidence investigation, violence alerting, etc. These tasks, aimed at identifying and classifying deviations from normal behavior in video data, face significant challenges due to the rarity of anomalies which leads to extremely imbalanced data and the impracticality of extensive frame-level data annotation for supervised learning. This paper introduces a novel hierarchical graph neural network (GNN) based model MissionGNN that addresses these challenges by leveraging a state-of-the-art large language model and a comprehensive knowledge graph for efficient weakly supervised learning in VAR. Our approach circumvents the limitations of previous methods by avoiding heavy gradient computations on large multimodal models and enabling fully frame-level training without fixed video segmentation. Utilizing automated, mission-specific knowledge graph generation, our model provides a practical and efficient solution for real-time video analysis without the constraints of previous segmentation-based or multimodal approaches. Experimental validation on benchmark datasets demonstrates our model's performance in VAD and VAR, highlighting its potential to redefine the landscape of anomaly detection and recognition in video surveillance systems. |
Votes: Ishan, brandon · Final decision: not recorded
#1390
-
Yunqiu 2022
Identifying Named Entities of Chinese Electronic Medical Records Based on RoBERTa-wwm Dynamic Fusion Model
Data. Anal. Knowl. Discov. 2022;6(2-3):242-250 2022 DOI: 10.11925/infotech.2096-3467.2021.0951 · Ref ID: 5304 [Objective] This paper proposes an entity recognition model based on RoBERTa-wwm dynamic fusion, aiming to improve the entity identification of Chinese electronic medical records. [Methods] First, we merged the semantic representations generated by each Transformer layer of the pre-trained language model RoBERTa-wwm. Then, we input them into the bi-directional long short-term memory network and the conditional random field module to recognize the entities of the electronic medical records. [Results] We examined our new model with the dataset of the “2017 National Knowledge Graph and Semantic Computing Conference (CCKS 2017)” and self-annotated electronic medical records. Their F1 values reached 94.08% and 90.08%, which were 0.23% and 0.39% higher than the RoBERTa-wwm-BiLSTM-CRF model. [Limitations] The RoBERTa-wwm used in this paper completed the pre-training process with a non-medical corpus. [Conclusions] The proposed method could improve the results of entity recognition tasks. © 2022, Chinese Academy of Sciences. All rights reserved. |
Votes: mohammed afaan, yuexi · Final decision: not recorded
#2354
-
Yurin 2018
The domain-specific editor for rule-based knowledge bases
2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) 2018;():0961-0966 2018 DOI: 10.23919/MIPRO.2018.8400176 · Ref ID: 6039 The aim of the paper is to describe a domain-specific editor for the design of rule-based knowledge bases in the field of the prognosis of technical conditions and remaining operation time of petrochemical equipment. The architecture, main functions, and the structure of the editor's configuration files are presented. A distinctive feature of the editor is a semantic layer in the form of a platform-independent model. This layer allows the editor to be configured to account for the features of a subject domain. The semantic layer is implemented as a set of domain-specific templates describing facts and rules (cause-and-effect relationships). These templates help to abstract from the syntax of particular knowledge representation languages (programming languages for knowledge bases, in particular CLIPS, the C Language Integrated Production System) and to generate the graphical user interface elements. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#414
-
Zafar 2024
KIMedQA: towards building knowledge-enhanced medical QA models
Medical question-answering systems require the ability to extract accurate, concise, and comprehensive answers. They will better comprehend the complex text and produce helpful answers if they can reason on the explicit constraints described in the question's textual context and the implicit, pertinent knowledge of the medical world. Integrating Knowledge Graphs (KG) with Language Models (LMs) is a common approach to incorporating structured information sources. However, effectively combining and reasoning over KG representations and language context remains an open question. To address this, we propose the Knowledge Infused Medical Question Answering system (KIMedQA), which employs two techniques viz. relevant knowledge graph selection and pruning of the large-scale graph to handle Vector Space Inconsistent (VSI) and Excessive Knowledge Information (EKI). The representation of the query and context are then combined with the pruned knowledge network using a pre-trained language model to generate an informed answer. Finally, we demonstrate through in-depth empirical evaluation that our suggested strategy provides cutting-edge outcomes on two benchmark datasets, namely MASH-QA and COVID-QA. We also compared our results to ChatGPT, a robust and very powerful generative model, and discovered that our model outperforms ChatGPT according to the F1 Score and human evaluation metrics such as adequacy. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#594
-
Zahera 2022
MULTPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs
21st International Semantic Web Conference (ISWC) 2022;13489():303-318 Electr Network Springer International Publishing Ag 2022 DOI: 10.1007/978-3-031-19433-7_18 · Ref ID: 2965 Keyphrase extraction aims to identify a small set of phrases that best describe the content of text. The automatic generation of keyphrases has become essential for many natural language applications such as text categorization, indexing, and summarization. In this paper, we propose MULTPAX, a multitask framework for extracting present and absent keyphrases using pre-trained language models and knowledge graphs. In particular, our framework contains three components: first, MULTPAX identifies present keyphrases from an input document. Then, MULTPAX links with external knowledge graphs to get more relevant phrases. Finally, MULTPAX ranks the extracted phrases based on their semantic relatedness to the input document and return top-k phrases as a final output. We conducted several experiments on four benchmark datasets to evaluate the performance of MULTPAX against different state-of-the-art baselines. The evaluation results demonstrate that our approach significantly outperforms the state-of-the-art baselines, with a significance t-test p < 0.041. Our source code and datasets are public available at https://github.com/dice-group/MultPAX. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
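MULTPAX's final step, ranking extracted phrases by semantic relatedness to the input document and returning the top-k, reduces to cosine similarity in embedding space. A minimal sketch with toy vectors; the embeddings, names, and dimensionality are placeholders, not the paper's model:

```python
import numpy as np

def rank_phrases(doc_vec, phrase_vecs, phrases, k=2):
    """Rank candidate keyphrases by cosine similarity to the
    document embedding and return the top-k phrases."""
    doc = doc_vec / np.linalg.norm(doc_vec)
    mat = phrase_vecs / np.linalg.norm(phrase_vecs, axis=1, keepdims=True)
    scores = mat @ doc
    order = np.argsort(-scores)            # highest similarity first
    return [phrases[i] for i in order[:k]]

# Toy 2-d "embeddings"; a real system would use a sentence encoder
doc = np.array([1.0, 0.0])
cands = np.array([[1.0, 0.1], [0.0, 1.0], [0.9, 0.2]])
top = rank_phrases(doc, cands, ["knowledge graph", "unrelated", "keyphrase"], k=2)
```

The same scoring would apply to both present keyphrases and candidates linked in from an external knowledge graph.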
#3425
-
Zavarella 2024
A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models
arXiv 2024;(): 2024 Ref ID: 8507 Knowledge graphs (KGs) have been successfully applied to the analysis of complex scientific and technological domains, with automatic KG generation methods typically building upon relation extraction models capturing fine-grained relations between domain entities in text. While these relations are fully applicable across scientific areas, existing models are trained on few domain-specific datasets such as SciERC and do not perform well on new target domains. In this paper, we experiment with leveraging in-context learning capabilities of Large Language Models to perform schema-constrained data annotation, collecting in-domain training instances for a Transformer-based relation extraction model deployed on titles and abstracts of research papers in the Architecture, Construction, Engineering and Operations (AECO) domain. By assessing the performance gain with respect to a baseline Deep Learning architecture trained on off-domain data, we show that by using a few-shot learning strategy with structured prompts and only minimal expert annotation the presented approach can potentially support domain adaptation of a science KG generation model. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#2042
-
Zeng 2024
XLORE 3: A Large-Scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources
In recent years, knowledge graph (KG) has attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs built from Baidu Baike and Wikipedia via a series of knowledge modeling and acquisition methods. In this article, we utilize systematic methods to improve XLORE's data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: (1) We design a comprehensive and reasonable schema, namely XLORE ontology, which can effectively organize and manage entities from various resources. (2) We merge equivalent entities in different languages to facilitate knowledge sharing. We provide a large-scale entity linking system to establish the associations between unstructured text and structured KG. (3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available and downloadable online at https://www.xlore.cn/, providing a valuable resource for researchers and practitioners in various fields. © 2024 Copyright held by the owner/author(s). |
Mike
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3348
-
Zeng 2023
Domain Knowledge Graph Construction Via A Simple Checker
arXiv 2023;(): 2023 Ref ID: 7875 With the availability of large language models, there is a growing interest for semiconductor chip design companies to leverage the technologies. For those companies, deployment of a new methodology must include two important considerations: confidentiality and scalability. In this context, this work tackles the problem of knowledge graph construction from hardware-design domain texts. We propose an oracle-checker scheme to leverage the power of GPT3.5 and demonstrate that the essence of the problem is in distillation of domain expert's background knowledge. Using RISC-V unprivileged ISA specification as an example, we explain key ideas and discuss practicality of our proposed oracle-checker approach. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3693
-
Zha 2023
M²ConceptBase: A Fine-Grained Aligned Concept-Centric Multimodal Knowledge Base
arXiv 2023;(): 2023 Ref ID: 7990 Multimodal knowledge bases (MMKBs) provide cross-modal aligned knowledge crucial for multimodal tasks. However, the images in existing MMKBs are generally collected for entities in encyclopedia knowledge graphs. Therefore, detailed groundings of visual semantics with linguistic concepts are lacking, which are essential for the visual concept cognition ability of multimodal models. Addressing this gap, we introduce M²ConceptBase, the first concept-centric MMKB. M²ConceptBase models concepts as nodes with associated images and detailed textual descriptions. We propose a context-aware multimodal symbol grounding approach to align concept-image and concept-description pairs using context information from image-text datasets. Comprising 951K images and 152K concepts, M²ConceptBase links each concept to an average of 6.27 images and a single description, ensuring comprehensive visual and textual semantics. Human studies confirm more than 95% alignment accuracy, underscoring its quality. Additionally, our experiments demonstrate that M²ConceptBase significantly enhances VQA model performance on the OK-VQA task. M²ConceptBase also substantially improves the fine-grained concept understanding capabilities of multimodal large language models through retrieval augmentation in two concept-related tasks, highlighting its value. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3926
-
Zhai 2023
Towards Faithful Knowledge Graph Explanation Through Deep Alignment in Commonsense Question Answering
arXiv 2023;(): 2023 Ref ID: 7874 The fusion of language models (LMs) and knowledge graphs (KGs) is widely used in commonsense question answering, but generating faithful explanations remains challenging. Current methods often overlook path decoding faithfulness, leading to divergence between graph encoder outputs and model predictions. We identify confounding effects and LM-KG misalignment as key factors causing spurious explanations. To address this, we introduce the LM-KG Fidelity metric to assess KG representation reliability and propose the LM-KG Distribution-aware Alignment (LKDA) algorithm to improve explanation faithfulness. Without ground truth, we evaluate KG explanations using the proposed Fidelity-Sparsity Trade-off Curve. Experiments on CommonsenseQA and OpenBookQA show that LKDA significantly enhances explanation fidelity and model performance, highlighting the need to address distributional misalignment for reliable commonsense reasoning.
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3416
-
Zhang 2024
Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction
arXiv 2024;(): 2024 Ref ID: 8219 In this work, we are interested in automated methods for knowledge graph creation (KGC) from input text. Progress on large language models (LLMs) has prompted a series of recent works applying them to KGC, e.g., via zero/few-shot prompting. Despite successes on small domain-specific datasets, these models face difficulties scaling up to text common in many real-world applications. A principal issue is that, in prior methods, the KG schema has to be included in the LLM prompt to generate valid triplets; larger and more complex schemas easily exceed the LLMs' context window length. Furthermore, there are scenarios where a fixed pre-defined schema is not available and we would like the method to construct a high-quality KG with a succinct self-generated schema. To address these problems, we propose a three-phase framework named Extract-Define-Canonicalize (EDC): open information extraction followed by schema definition and post-hoc canonicalization. EDC is flexible in that it can be applied to settings where a pre-defined target schema is available and when it is not; in the latter case, it constructs a schema automatically and applies self-canonicalization. To further improve performance, we introduce a trained component that retrieves schema elements relevant to the input text; this improves the LLMs' extraction performance in a retrieval-augmented generation-like manner. We demonstrate on three KGC benchmarks that EDC is able to extract high-quality triplets without any parameter tuning and with significantly larger schemas compared to prior works. Code for EDC is available at https://github.com/clear-nus/edc. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
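EDC's post-hoc canonicalization maps newly extracted schema elements onto existing ones. One common way to realize such a step, sketched here as an assumption rather than the paper's implementation, is nearest-neighbor matching of phrase embeddings with a similarity threshold; the relation names, vectors, and threshold below are all illustrative:

```python
import numpy as np

def canonicalize(phrase_vec, schema_vecs, schema_names, threshold=0.8):
    """Map an extracted relation phrase onto the closest existing
    schema relation by cosine similarity; below the threshold, treat
    it as a new, self-defined relation instead."""
    v = phrase_vec / np.linalg.norm(phrase_vec)
    mat = schema_vecs / np.linalg.norm(schema_vecs, axis=1, keepdims=True)
    scores = mat @ v
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return schema_names[best]
    return None  # no close match: admit as a new relation

schema = np.array([[1.0, 0.0], [0.0, 1.0]])
names = ["works_for", "located_in"]
match = canonicalize(np.array([0.9, 0.1]), schema, names)
miss = canonicalize(np.array([0.6, 0.6]), schema, names)
```

In the schema-free setting the `None` branch is where self-canonicalization would add the phrase to the growing schema.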
#582
-
Zhang 2023
Multi-Faceted Knowledge-Driven Pre-Training for Product Representation Learning
IEEE Trans. Knowl. Data Eng. 2023;35(7):7239-7250 2023 DOI: 10.1109/tkde.2022.3200921 · Ref ID: 3752 As a key component of e-commerce computing, product representation learning (PRL) provides benefits for a variety of applications, including product matching, search, and categorization. The existing PRL approaches have poor language understanding ability due to their inability to capture contextualized semantics. In addition, the learned representations by existing methods are not easily transferable to new products. Inspired by the recent advance of pre-trained language models (PLMs), we make the attempt to adapt PLMs for PRL to mitigate the above issues. In this article, we develop KINDLE, a Knowledge-drIven pre-trainiNg framework for proDuct representation LEarning, which can preserve the contextual semantics and multi-faceted product knowledge robustly and flexibly. Specifically, we first extend traditional one-stage pre-training to a two-stage pre-training framework, and exploit a deliberate knowledge encoder to ensure a smooth knowledge fusion into PLM. In addition, we propose a multi-objective heterogeneous embedding method to represent thousands of knowledge elements. This helps KINDLE calibrate knowledge noise and sparsity automatically by replacing isolated classes as training targets in knowledge acquisition tasks. Furthermore, an input-aware gating network is proposed to select the most relevant knowledge for different downstream tasks. Finally, extensive experiments have demonstrated the advantages of KINDLE over the state-of-the-art baselines across three downstream tasks. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#774
-
Zhang 2024
SimRE: Simple contrastive learning with soft logical rule for knowledge graph embedding
Knowledge graphs serve as a pivotal framework for the structured representation of information regarding entities and relations. However, in the real world, these knowledge graphs are often incomplete, harboring missing facts. Knowledge graph completion (KGC) has emerged as a central research focus, entailing the automated prediction of these missing facts and garnering substantial scholarly attention in recent years. Text-based knowledge graph embedding methods have demonstrated considerable potential for tackling the challenges associated with KGC by employing pre-trained language models. However, their limitation lies in the lack of logical features, which constrains the efficacy of capturing intricate patterns within knowledge graphs. This paper proposes SimRE, a straightforward contrastive learning framework augmented with soft logic rules. SimRE introduces a self-supervised framework that leverages the input rule bodies to predict the corresponding rule heads through a contrastive objective. We introduce two rule sampling techniques to enhance the efficiency and accuracy of the model: in-batch rule negatives and pre-batch rule negatives. SimRE employs a simple method for integrating logical features with the text-based model. The experimental results on benchmark datasets demonstrate that the proposed approach outperforms state-of-the-art methods.
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
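The in-batch negative sampling SimRE mentions follows the standard contrastive pattern: within a batch, each rule body's matching rule head is its positive, and every other head in the batch serves as a negative. A generic InfoNCE-style sketch on toy data, not the SimRE code:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE with in-batch negatives: row i of `positives` is the
    positive for row i of `anchors`; every other row in the batch is
    a negative. Returns the mean NLL of the correct matches."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
loss_aligned = info_nce(x, x)                       # correct pairs: low loss
loss_shuffled = info_nce(x, np.roll(x, 1, axis=0))  # mismatched pairs: high loss
```

Pre-batch negatives would extend `positives` with rows cached from earlier batches; the loss itself is unchanged.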
#1552
-
Zhang 2000
Language model for multilingual natural language generation
Shanghai Jiaotong Daxue Xuebao 2000;34(7):944-947 2000 Ref ID: 5828 This paper introduced a knowledge representation model for multilingual natural language text generation system. It can be divided into two levels: semantic resources and syntax resources. The former is used in describing text content by schema and optimized rules; the latter is used in constructing sentence pattern, mapping language resource and determining text specific form according to the sentence structure class, syntactic rules and lexical information. The model is based on a complex feature set. It can be used to extend the abstract initial semantic data to all kinds of language resource for multilingual text generation. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#3520
-
Zhang 2024
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models
arXiv 2024;(): 2024 Ref ID: 8446 Low sample efficiency is an enduring challenge of reinforcement learning (RL). With the advent of versatile large language models (LLMs), recent works impart common-sense knowledge to accelerate policy learning for RL processes. However, we note that such guidance is often tailored for one specific task but loses generalizability. In this paper, we introduce a framework that harnesses LLMs to extract background knowledge of an environment, which contains general understandings of the entire environment, making various downstream RL tasks benefit from one-time knowledge representation. We ground LLMs by feeding a few pre-collected experiences and requesting them to delineate background knowledge of the environment. Afterward, we represent the output knowledge as potential functions for potential-based reward shaping, which has a good property for maintaining policy optimality from task rewards. We instantiate three variants to prompt LLMs for background knowledge, including writing code, annotating preferences, and assigning goals. Our experiments show that these methods achieve significant sample efficiency improvements in a spectrum of downstream tasks from Minigrid and Crafter domains. |
Mike
voted
Xinchen
voted
Final decision
What was the agreed final decision?
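The potential-based reward shaping this abstract relies on has a simple closed form, r' = r + gamma * Phi(s') - Phi(s), which provably preserves the optimal policy of the underlying task reward. A toy sketch with a hand-written gridworld potential standing in for the LLM-derived background knowledge (the potential and environment are illustrative assumptions):

```python
def shaped_reward(reward, potential, s, s_next, gamma=0.99, done=False):
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).
    Terminal states use Phi = 0 so the shaping telescopes to zero
    over any episode and leaves the optimal policy unchanged."""
    phi_next = 0.0 if done else potential(s_next)
    return reward + gamma * phi_next - potential(s)

# Toy potential: negative Manhattan distance to a goal cell at (3, 3)
goal = (3, 3)
phi = lambda s: -(abs(s[0] - goal[0]) + abs(s[1] - goal[1]))

r_toward = shaped_reward(0.0, phi, (0, 0), (1, 0))  # moving toward the goal
r_away = shaped_reward(0.0, phi, (1, 0), (0, 0))    # moving away from it
```

In the paper's framing, the LLM would emit `phi` (as code, preferences, or goals) once per environment, and all downstream tasks reuse it.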
#861
-
Zhang 2023
User-Centric Conversational Recommendation: Adapting the Need of User with Large Language Models
17th ACM Conference on Recommender Systems (RecSys) 2023;():1349-1354 Singapore, SINGAPORE Assoc Computing Machinery 2023 DOI: 10.1145/3604915.3608885 · Ref ID: 3327 Conversational recommender systems (CRS) promise to provide a more natural user experience for exploring and discovering items of interest through ongoing conversation. However, effectively modeling and adapting to users' complex and changing preferences remains challenging. This research develops user-centric methods that focus on understanding and adapting to users throughout conversations to provide the most helpful recommendations. First, a graph-based Conversational Path Reasoning (CPR) framework is proposed that represents dialogs as interactive reasoning over a knowledge graph to capture nuanced user interests and explain recommendations. To further enhance relationship modeling, graph neural networks are incorporated for improved representation learning. Next, to address uncertainty in user needs, the Vague Preference Multi-round Conversational Recommendation (VPMCR) scenario and matching Adaptive Vague Preference Policy Learning (AVPPL) solution are presented using reinforcement learning to tailor recommendations to evolving preferences. Finally, opportunities to leverage large language models are discussed to further advance user experiences via advanced user modeling, policy learning, and response generation. Overall, this research focuses on designing conversational recommender systems that continuously understand and adapt to users' ambiguous, complex and changing needs during natural conversations. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1421
-
Zhang 2023
Integrating Automated Knowledge Extraction with Large Language Models for Explainable Medical Decision-Making
Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():1710-1717 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BIBM58861.2023.10385557 · Ref ID: 4985 Large language models (LLMs) have demonstrated strong reasoning ability and inspired many previously unimaginable applications. In this paper, we aim to harness the strong reasoning capability of LLMs toward explainable medical diagnosis. As we know, deep learning has been widely adopted and has shown improvement in medical diagnostics. However, it is often criticized for its lack of interpretability. To address this drawback, we propose the first method that innovatively combines Markov logic networks (MLNs) with external knowledge extracted using LLMs, aiming to improve both interpretability and accuracy. Specifically, our approach involves a new process, powered by LLMs and a search engine, to automatically collect and organize external medical knowledge. The outcome is a set of first-order logic (FOL) rules, which then become a key component for the following MLN-based diagnostic algorithm. The resulting MLN-based model can maintain the accuracy of deep networks while providing understandable reasoning for its decisions. By aiming to blend specific knowledge from the medical domain with LLM techniques, our work contributes towards the development of improved automatic diagnosis systems, with the potential for enhancing transparency and trust in medical diagnostics. © 2023 IEEE.
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#3241
-
Zhang 2023
CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation
arXiv 2023;(): 2023 Ref ID: 7700 Commonsense knowledge is crucial to many natural language processing tasks. Existing works usually incorporate graph knowledge with conventional graph neural networks (GNNs), resulting in a sequential pipeline that compartmentalizes the encoding processes for textual and graph-based knowledge. This compartmentalization, however, does not fully exploit the contextual interplay between these two types of input knowledge. In this paper, a novel context-aware graph-attention model (Context-aware GAT) is proposed, designed to effectively assimilate global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism. Specifically, the proposed framework employs an innovative approach to representation learning that harmonizes heterogeneous features by amalgamating flattened graph knowledge with text data. The hierarchical application of graph knowledge aggregation within connected subgraphs, complemented by contextual information to bolster the generation of commonsense-driven dialogues, is analyzed. Empirical results demonstrate that our framework outperforms conventional GNN-based language models. Both automated and human evaluations affirm the significant performance enhancements achieved by our proposed model over the concept flow baseline.
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1547
-
Zhang 2024
LA-UCL: LLM-Augmented Unsupervised Contrastive Learning Framework for Few-Shot Text Classification
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():10198-10207 European Language Resources Association (ELRA) 2024 Ref ID: 4597 The few-shot tasks require the model to have the ability to generalize from a few samples. However, due to the lack of cognitive ability, the current works cannot fully utilize limited samples to expand the sample space and still suffer from overfitting issues. To address these problems, we propose an LLM-Augmented Unsupervised Contrastive Learning Framework (LA-UCL), which introduces a cognition-enabled Large Language Model (LLM) for efficient data augmentation, and presents corresponding contrastive learning strategies. Specifically, in the self-augmented contrastive learning module, we construct a retrieval-based in-context prompt scheme by retrieving similar but different-category data from the original samples, guiding the LLM to generate more discriminative augmented data. Then, we design a group-level contrastive loss to enhance the model's discriminative ability. In the external-augmented contrastive learning module, we utilize web knowledge retrieval to expand the sample space and leverage the LLM to generate more diverse data, and introduce a sample-level contrastive loss for unlabeled data to improve the model's generalization. Experimental results on six datasets show that our model exceeds the baseline models. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1452
-
Zhang 2024
A Joint Method for Combat Intent Recognition and Key Information Extraction
Communications in Computer and Information Science 2024;2018 CCIS():115-125 Springer Science and Business Media Deutschland GmbH 2024 DOI: 10.1007/978-981-97-0844-4_9 · Ref ID: 4728 To alleviate the problems of poor quality and low efficiency in traditional combat plan making, we propose an intelligent combat plan generation method based on Bert pre-trained language model. First, we studied practical combat scenarios and military related websites, and constructed a military domain combat intent dataset that includes structured information such as combat categories, objects, and scenarios. Second, we utilize Bert pre-trained language model for semantic analysis of requirements, TextCNN (Convolutional Neural Network for Text) for combat intent recognition, and BiLSTM (Bidirectional Long Short-Term Memory) for key information extracting and entity normalization. Thus, based on the intent and key information, candidate schemes can be retrieved from the knowledge graph in the field of military operations in the future. Compared with traditional methods, the scheme quality and generation efficiency are significantly improved. This study provides an effective approach for intelligent decision support in the military field, and also offers references for intelligent scheme generation in other domains. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024. |
brandon
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#422
-
Zhang 2022
A knowledge extraction framework for domain-specific application with simplified pre-trained language model and attention-based feature extractor
With the advancement of industrial informatics, intelligent algorithms are increasingly applied in various industrial products and applications. In this paper, we propose a knowledge extraction framework for domain-specific text. This framework can extract entities from text for subsequent tasks such as knowledge graph construction. The proposed framework contains three modules, namely a domain feature pre-trained model, LSTM-based named entity recognition, and attention-based nested named entity recognition. The domain feature pre-trained model can effectively learn the features of a domain corpus, such as professional terms that are not included in the general domain corpus. Flat named entity recognition can use the vectors from the pre-trained model to obtain the entities from domain-specific text. The nested named entity recognition based on the attention mechanism and the weight sliding balance strategy can effectively identify entity types with higher nesting rates. The framework achieves good results in the field of nuclear power plant maintenance reports, and the methods for the domain pre-trained model and LSTM-based flat named entity recognition have been successfully applied to practical tasks.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#286
-
Zhang 2021
A Framework for Effective Knowledge Extraction from A Data Space Formed by Unstructured Technical Reports using Pre-trained Models
17th IEEE International Conference on E-Business Engineering (ICEBE) 2021;():120-125 S China Univ Technol, Guangzhou, PEOPLES R CHINA Ieee 2021 DOI: 10.1109/icebe52470.2021.00028 · Ref ID: 3217 The transformation of unstructured data into triples is a key task in knowledge graph construction. It remains a great challenge to complete this task on technical reports. In this work, we propose a framework for effective data structuring in knowledge graph construction from a data space formed by technical reports. This framework specifically consists of two pre-trained language models to provide the embeddings and a sequence labeling model to tag the entity labels. The pre-trained models, i.e. the Flair embedding and the BERT model, are employed, and their output vectors are combined for downstream tasks. To evaluate the proposed method, we conduct named entity recognition experiments using the status reports of complex equipment in nuclear power plants. The evaluation shows the framework achieves remarkable improvement on F1 score. This paper details the framework, the experiments, and the evaluation of the proposed method.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1837
-
Zhang 2021
RoKGDS: A Robust Knowledge Grounded Dialog System
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2021;13029 LNAI():377-387 Springer Science and Business Media Deutschland GmbH 2021 DOI: 10.1007/978-3-030-88483-3_30 · Ref ID: 5632 In this paper, we propose a pre-training based Robust Knowledge Grounded Dialog System (RoKGDS) to enhance the performance of the model in unknown scenarios, which is easily generalized to various knowledge grounded dialog tasks, such as persona dialog, knowledge dialog, and recommendation dialog. We use a bucket encoder to efficiently extract all kinds of knowledge information (e.g. profile, knowledge graph, and dialog goal). To improve the robustness of the model, we develop a hybrid decoder with a hybrid attention and a copy mechanism. The hybrid attention is an adaptation scheme to apply the pre-trained language model to our model, and the copy mechanism is a gate mechanism to control generating a word from the generic vocabulary or the input knowledge. Experiments show that our model is more robust than the other baseline models. Furthermore, we use visualization to explain the effectiveness of the hybrid attention compared to the other two adaptation schemes. In the 2021 Language and Intelligence Challenge: Multi-Skill Dialog task, our best model ranked 3rd in the automatic evaluation stage and 5th in the human evaluation stage. © 2021, Springer Nature Switzerland AG.
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1066
-
Zhang 2024
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2024;():15459-15469 IEEE Computer Society 2024 DOI: 10.1109/CVPR52733.2024.01464 · Ref ID: 4108 We present ChatScene, a Large Language Model (LLM)-based agent that leverages the capabilities of LLMs to generate safety-critical scenarios for autonomous vehicles. Given unstructured language instructions, the agent first generates textually described traffic scenarios using LLMs. These scenario descriptions are subsequently broken down into several sub-descriptions for specified details such as behaviors and locations of vehicles. The agent then distinctively transforms the textually described sub-scenarios into domain-specific languages, which then generate actual code for prediction and control in simulators, facilitating the creation of diverse and complex scenarios within the CARLA simulation environment. A key part of our agent is a comprehensive knowledge retrieval component, which efficiently translates specific textual descriptions into corresponding domain-specific code snippets by training a knowledge database containing the scenario description and code pairs. Extensive experimental results underscore the efficacy of ChatScene in improving the safety of autonomous vehicles. For instance, the scenarios generated by ChatScene show a 15% increase in collision rates compared to state-of-the-art baselines when tested against different reinforcement learning-based ego vehicles. Furthermore, we show that by using our generated safety-critical scenarios to fine-tune different RL-based autonomous driving models, they can achieve a 9% reduction in collision rates, surpassing current SOTA methods. ChatScene effectively bridges the gap between textual descriptions of traffic scenarios and practical CARLA simulations, providing a unified way to conveniently generate safety-critical scenarios for safety testing and improvement for AVs. The code is available at https://github.com/javyduck/ChatScene. © 2024 IEEE.
Voted: mohammed afaan, Ishan
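ChatScene's knowledge-retrieval step, matching a textual sub-description to a stored code snippet, can be sketched as a nearest-neighbour lookup. The snippet store, the `scenario.add(...)` strings, and the toy bag-of-words similarity below are illustrative assumptions, not ChatScene's actual trained database or encoder:

```python
from collections import Counter
import math

# Hypothetical description -> code-snippet store; a stand-in for a trained
# database of (scenario description, code) pairs.
SNIPPETS = {
    "ego vehicle brakes suddenly at intersection": "scenario.add(BrakeAction(rate=0.9))",
    "pedestrian crosses ahead of the ego vehicle": "scenario.add(PedestrianCrossing(gap=5.0))",
}

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a learned encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(description: str) -> str:
    """Return the stored snippet whose description is most similar to the query."""
    query = embed(description)
    best = max(SNIPPETS, key=lambda desc: cosine(query, embed(desc)))
    return SNIPPETS[best]
```

A production system would replace `embed` with a learned sentence encoder and an approximate nearest-neighbour index over the snippet database.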
#1611
-
Zhang 2023
A LLM-Based Simulation Scenario Aided Generation Method
ITOEC 2023 - IEEE 7th Information Technology and Mechatronics Engineering Conference 2023;():1350-1354 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ITOEC57671.2023.10291525 · Ref ID: 5020 In simulation training systems, the generation of simulation scenarios is a basic problem that needs to be studied. This paper first expounds the technical characteristics of LLMs and knowledge graphs; it then structurally describes the content of simulation scenarios and builds a scenario knowledge graph. Based on the characteristics of scenario-aided generation, a simulation scenario generation method based on an LLM is proposed, which uses prompts to fuse the knowledge graph with the LLM; the implementation steps of this method are then elaborated. Finally, a concrete application shows that the proposed method is a useful reference for the generation of simulation scenarios. © 2023 IEEE. |
Voted: mohammed afaan, yuexi
#3968
-
Zhang 2022
Utilizing Background Knowledge for Robust Reasoning over Traffic Situations
arXiv 2022;(): 2022 Ref ID: 7627 Understanding novel situations in the traffic domain requires an intricate combination of domain-specific and causal commonsense knowledge. Prior work has provided sufficient perception-based modalities for traffic monitoring; in this paper, we focus on a complementary research aspect of Intelligent Transportation: traffic understanding. We scope our study to text-based methods and datasets given the abundant commonsense knowledge that can be extracted using language models from large corpora and knowledge graphs. We adopt three knowledge-driven approaches for zero-shot QA over traffic situations, based on prior natural language inference methods, commonsense models with knowledge graph self-supervision, and dense retriever-based models. We constructed two text-based multiple-choice question answering sets: BDD-QA for evaluating causal reasoning in the traffic domain and HDT-QA for measuring the possession of domain knowledge akin to human driving license tests. Among the methods, Unified-QA reaches the best performance on the BDD-QA dataset with the adaptation of multiple formats of question answers. Language models trained with inference information and commonsense knowledge are also good at predicting the cause and effect in the traffic domain but perform badly at answering human-driving QA sets. For such sets, DPR+Unified-QA performs the best due to its efficient knowledge extraction. |
Voted: Davis, Srividya
#3580
-
Zhang 2024
KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking
arXiv 2024;(): 2024 Ref ID: 8215 This paper introduces KnowHalu, a novel approach for detecting hallucinations in text generated by large language models (LLMs), utilizing step-wise reasoning, multi-formulation query, multi-form knowledge for factual checking, and fusion-based detection mechanism. As LLMs are increasingly applied across various domains, ensuring that their outputs are not hallucinated is critical. Recognizing the limitations of existing approaches that either rely on the self-consistency check of LLMs or perform post-hoc fact-checking without considering the complexity of queries or the form of knowledge, KnowHalu proposes a two-phase process for hallucination detection. In the first phase, it identifies non-fabrication hallucinations–responses that, while factually correct, are irrelevant or non-specific to the query. The second phase, multi-form based factual checking, contains five key steps: reasoning and query decomposition, knowledge retrieval, knowledge optimization, judgment generation, and judgment aggregation. Our extensive evaluations demonstrate that KnowHalu significantly outperforms SOTA baselines in detecting hallucinations across diverse tasks, e.g., improving by 15.65% in QA tasks and 5.50% in summarization tasks, highlighting its effectiveness and versatility in detecting hallucinations in LLM-generated content. |
Voted: yuexi, Mike
#4002
-
Zhang 2024
Zero-Shot Learning Over Large Output Spaces : Utilizing Indirect Knowledge Extraction from Large Language Models
arXiv 2024;(): 2024 Ref ID: 8381 Extreme Multi-label Learning (XMC) is a task that allocates the most relevant labels for an instance from a predefined label set. Extreme Zero-shot XMC (EZ-XMC) is a special setting of XMC wherein no supervision is provided; only the instances (raw text of the document) and the predetermined label set are given. The scenario is designed to address cold-start problems in categorization and recommendation. Traditional state-of-the-art methods extract pseudo labels from the document title or segments. These labels from the document are used to train a zero-shot bi-encoder model. The main issue with these generated labels is their misalignment with the tagging task. In this work, we propose a framework to train a small bi-encoder model via feedback from the large language model (LLM); the bi-encoder model encodes the document and labels into embeddings for retrieval. Our approach leverages the zero-shot ability of the LLM to assess the correlation between labels and the document instead of using the low-quality labels extracted from the document itself. Our method also guarantees fast inference without the involvement of the LLM. The performance of our approach outperforms the SOTA methods on various datasets while retaining a similar training time for large datasets. |
Voted: Ishan, brandon
#2128
-
Zhang 2024
The Application of Fine-Tuning on Pretrained Language Model in Information Extraction for Fault Knowledge Graphs
2024 9th International Conference on Intelligent Computing and Signal Processing (ICSP) 2024;():469-473 2024 DOI: 10.1109/ICSP62122.2024.10743881 · Ref ID: 6909 Constructing fault knowledge graphs holds significant importance for achieving intelligent maintenance and diagnosis in high-end equipment manufacturing. Effective information extraction and knowledge graph construction have proven challenging due to the lack of standardized representation of semantically complex unstructured text in the industrial domain. Therefore, in this study, we performed fine-tuning on the pre-trained language model (ChatGLM2-6B) with specific prompts to achieve information extraction from fault-related texts, ultimately leading to the construction of a fault knowledge graph. Experimental results demonstrate that the proposed method not only supports fine-tuning with limited data but also exhibits enhanced capability in understanding complex semantics related to fault symptoms and causes. |
Voted: Mike, Kwesi
#9
-
Zhang 2024
Advancing building energy modeling with large language models: Exploration and case studies
The rapid progression in artificial intelligence has facilitated the emergence of large language models like ChatGPT, offering potential applications extending into specialized engineering modeling, especially physics-based building energy modeling. This paper investigates the innovative integration of large language models with building energy modeling software, focusing specifically on the fusion of ChatGPT with EnergyPlus. A literature review is first conducted to reveal a growing trend of incorporating large language models in engineering modeling, albeit limited research on their application in building energy modeling. We underscore the potential of large language models in addressing building energy modeling challenges and outline potential applications including simulation input generation, simulation output analysis and visualization, conducting error analysis, co-simulation, simulation knowledge extraction and training, and simulation optimization. Three case studies reveal the transformative potential of large language models in automating and optimizing building energy modeling tasks, underscoring the pivotal role of artificial intelligence in advancing sustainable building practices and energy efficiency. The case studies demonstrate that selecting the right large language model techniques is essential to enhance performance and reduce engineering efforts. The findings advocate a multidisciplinary approach in future artificial intelligence research, with implications extending beyond building energy modeling to other specialized engineering modeling. |
Voted: yuexi, mohammed afaan
#2127
-
Zhang 2009
Application of Decision-making Ontology for automotive body assembly design
2009 IEEE International Conference on Industrial Engineering and Engineering Management 2009;():764-768 2009 DOI: 10.1109/IEEM.2009.5372924 · Ref ID: 6468 Automotive body assembly design in conceptual design stage is a complex group decision-making process which involves knowledge communications and projects selections. Ontology is a useful tool to abstract vague and linguistic knowledge. In this paper, a project-handling module based decision-making ontology model (PMDOM) is proposed to accelerate communication of decision-making and help for the selection of projects. Based on PMDOM, an automotive body assembly domain ontology model is set up and the decision-making support system (DSS) has been implemented by using JAVA. It shows that the DSS with PMDOM is better than others without PMDOM. |
Voted: mohammed afaan, yuexi
#3632
-
Zhang 2024
Large Language Models as Event Forecasters
arXiv 2024;(): 2024 Ref ID: 8386 Key elements of human events are extracted as quadruples that consist of subject, relation, object, and timestamp. This representation can be extended to a quintuple by adding a fifth element: a textual summary that briefly describes the event. These quadruples or quintuples, when organized within a specific domain, form a temporal knowledge graph (TKG). Current learning frameworks focus on a few TKG-related tasks, such as predicting an object given a subject and a relation or forecasting the occurrences of multiple types of events (i.e., relation) in the next time window. They typically rely on complex structural and sequential models like graph neural networks (GNNs) and recurrent neural networks (RNNs) to update intermediate embeddings. However, these methods often neglect the contextual information inherent in each quintuple, which can be effectively captured through concise textual descriptions. In this paper, we investigate how large language models (LLMs) can streamline the design of TKG learning frameworks while maintaining competitive accuracy in prediction and forecasting tasks. We develop multiple prompt templates to frame the object prediction (OP) task as a standard question-answering (QA) task, suitable for instruction fine-tuning with an encoder-decoder generative LLM. For multi-event forecasting (MEF), we design simple yet effective prompt templates for each TKG quintuple. This novel approach removes the need for GNNs and RNNs, instead utilizing an encoder-only LLM to generate fixed intermediate embeddings, which are subsequently processed by a prediction head with a self-attention mechanism to forecast potential future relations. Extensive experiments on multiple real-world datasets using various evaluation metrics validate the effectiveness and robustness of our approach. |
Voted: Srividya, Ishan
#1214
-
Zhang 2022
DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 2022;():5123-5133 Association for Computational Linguistics (ACL) 2022 Ref ID: 5411 In recent years, Graph Neural Network (GNN) approaches with enhanced knowledge graphs (KG) perform well in question answering (QA) tasks. One critical challenge is how to effectively utilize interactions between the QA context and KG. However, existing work only adopts the identical QA context representation to interact with multiple layers of KG, which results in a restricted interaction. In this paper, we propose DRLK (Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs), a novel model that utilizes dynamic hierarchical interactions between the QA context and KG for reasoning. DRLK extracts dynamic hierarchical features in the QA context, and performs inter-layer and intra-layer interactions on each iteration, allowing the KG representation to be grounded with the hierarchical features of the QA context. We conduct extensive experiments on four benchmark datasets in medical QA and commonsense reasoning. The experimental results demonstrate that DRLK achieves state-of-the-art performances on two benchmark datasets and performs competitively on the others. © 2022 Association for Computational Linguistics. |
Voted: Ishan, Srividya
#1380
-
Zhang 2024
How Language Model Hallucinations Can Snowball
Proceedings of Machine Learning Research 2024;235():59670-59684 ML Research Press 2024 Ref ID: 4400 A major risk of using language models in practical applications is their tendency to hallucinate incorrect statements. Hallucinations are often attributed to knowledge gaps in LMs, but we show that LMs sometimes produce hallucinations that they can separately recognize as incorrect. To do this, we construct three question-answering datasets where LMs often state an incorrect answer which is followed by an explanation with at least one incorrect claim. Crucially, we find that GPT-3.5, GPT-4, and LLaMA2-70B-chat can identify 67%, 87%, and 94% of these incorrect claims, respectively. We show that this phenomenon doesn't disappear under higher-temperature sampling, beam search, and zero-shot chain-of-thought prompting. These findings reveal that LM hallucinations can snowball: early mistakes by an LM can lead to more mistakes that otherwise would not be made. Copyright 2024 by the author(s) |
Voted: Xinchen, Srividya
#424
-
Zhang 2024
Knowledge graph accuracy evaluation: an LLM-enhanced embedding approach
As an effective way for knowledge representation and knowledge storage, knowledge graphs have been widely used in various fields. However, with the rapid increase in the scale and volume of various knowledge graphs, there will inevitably be some knowledge quality issues. To evaluate the accuracy of a knowledge graph effectively and efficiently, a common paradigm is to match the facts in the knowledge graph with specific external knowledge. In this study, an LLM-enhanced (large language model enhanced) embedding framework is designed, integrating the verification ability of large language models to further evaluate the embedding results. First, an optimized embedding model is proposed to make use of the knowledge graph's internal structural information to measure whether the relation of a given triplet is probably well-founded. Then, the triplets which have fewer paths to support themselves are selected as the questionable ones, as their correctness cannot be determined confidently. Finally, the questionable triplets are filtered, and LLMs are adopted as external knowledge for further fact verification. The above three parts are aggregated to achieve automated, accurate and efficient evaluation of knowledge graphs. |
Voted: Kwesi, Davis
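The path-based triple screening described in this abstract (flagging triples with few supporting paths for LLM verification) can be illustrated on a toy graph. The graph, the relation names, and the 2-hop support heuristic are invented for illustration and are not the paper's actual embedding model:

```python
# Toy knowledge graph as (head, relation, tail) triples; names are invented.
KG = {
    ("alice", "works_at", "acme"),
    ("alice", "colleague_of", "bob"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
}

def support_paths(head: str, tail: str) -> int:
    """Count paths of length <= 2 between head and tail (edges treated as
    undirected), a crude proxy for structural support of a triple."""
    adj: dict[str, set[str]] = {}
    for h, _, t in KG:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    count = 0
    for mid in adj.get(head, set()):
        if mid == tail:
            count += 1          # direct edge
        elif tail in adj.get(mid, set()):
            count += 1          # two-hop path via mid
    return count

def questionable(triples, threshold: int = 2):
    """Triples with too few supporting paths would be sent to an LLM for
    external fact verification (that step is not shown here)."""
    return [t for t in triples if support_paths(t[0], t[2]) < threshold]
```

In the framework described above, this structural filter only selects candidates; the final verdict on each questionable triple comes from the LLM acting as external knowledge.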
#1494
-
Zhang 2022
Knowledge Collaborative Fine-tuning for Low-resource Knowledge Graph Completion
Knowledge graph completion can make the knowledge graph more complete. Unfortunately, most existing methods for knowledge graph completion assume that the entities or relations in the knowledge graph have sufficient triple instances. Nevertheless, there are a great many long-tail triples in general domains. Furthermore, it is challenging to obtain a large amount of high-quality annotation data in vertical domains. To address these issues, a knowledge collaborative fine-tuning approach is proposed for low-resource knowledge graph completion. The structured knowledge is leveraged to construct the initial prompt template, and the optimal templates, labels, and model parameters are learnt through a collaborative fine-tuning algorithm. The proposed method leverages the explicit structured knowledge in the knowledge graph and the implicit triple knowledge from the language model, and can be applied to the tasks of link prediction and relation extraction. Experimental results show that the proposed approach can obtain state-of-the-art performance on three knowledge graph reasoning datasets and five relation extraction datasets. © 2022 Chinese Academy of Sciences. All rights reserved. |
Voted: Mike, Srividya
#2607
-
Zhang 2019
Knowledge Adaptive Neural Network for Natural Language Inference
2019 International Joint Conference on Neural Networks (IJCNN) 2019;():1-8 2019 DOI: 10.1109/IJCNN.2019.8851884 · Ref ID: 6246 Natural language inference (NLI) has received widespread attention in recent years due to its contribution to various natural language processing tasks, such as question answering, abstract text summarization, and video caption. Most existing works focus on modeling the sentence interaction information, while the use of commonsense knowledge is not well studied for NLI. In this paper, we propose knowledge adaptive neural network (KANN) that adaptively incorporates commonsense knowledge at sentence encoding and inference stages. We first perform knowledge collection and representation to identify the relevant knowledge. Then we use a knowledge absorption gate to embed knowledge into neural network models. Experiments on two benchmark datasets, namely SNLI and MultiNLI for natural language inference, show the advantages of our proposed model. Furthermore, our model is comparable to if not better than the recent neural network based approaches on NLI. |
Voted: Mike, Srividya
#285
-
Zhang 2023
ForensiQ: A Knowledge Graph Question Answering System for IoT Forensics
14th EAI International Conference on Digital Forensics and Cyber Crime (ICDF2C) 2023;571():300-314 New York, NY Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-56583-0_20 · Ref ID: 3046 The increasing number of attacks against the Internet of Things (IoT) has made IoT forensics critically important for reporting and mitigating cyber incidents and crimes. However, the heterogeneity of IoT environments and the complexity and volume of IoT data present significant challenges to forensic practitioners. The advent of question answering (QA) systems and large language models (LLM) offers a potential solution to accessing sophisticated IoT forensic knowledge and data. In light of this, we propose ForensiQ, a framework based on knowledge graph question answering (KGQA), to help investigators navigate complex IoT forensic artifacts and cybersecurity knowledge. Our framework integrates knowledge graphs (KG) into the IoT forensic workflow to better organize and analyze forensic artifacts. We also have developed a novel KGQA model that serves as a natural-language user interface to the IoT forensic KG. Our evaluation results show that, compared to existing KGQA models, ForensiQ demonstrates higher accuracy in answering natural language questions when applied to our experimental IoT forensic KG. |
Voted: Xinchen, mohammed afaan
#1729
-
Zhang 2024
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():15623-15636 Association for Computational Linguistics (ACL) 2024 Ref ID: 4298 Large Language Models (LLMs) have exhibited impressive capabilities in various tasks, yet their vast parameter sizes restrict their applicability in resource-constrained settings. Knowledge distillation (KD) offers a viable solution by transferring expertise from large teacher models to compact student models. However, traditional KD techniques face specific challenges when applied to LLMs, including restricted access to LLM outputs, significant teacher-student capacity gaps, and the inherited mis-calibration issue. In this work, we present PLaD, a novel preference-based LLM distillation framework. PLaD exploits the teacher-student capacity discrepancy to generate pseudo-preference pairs where teacher outputs are preferred over student outputs. Then, PLaD leverages a ranking loss to re-calibrate student's estimation of sequence likelihood, which steers the student's focus towards understanding the relative quality of outputs instead of simply imitating the teacher. PLaD bypasses the need for access to teacher LLM's internal states, tackles the student's expressivity limitations, and mitigates the student mis-calibration issue. Through extensive experiments on two sequence generation tasks and with various LLMs, we demonstrate the effectiveness of our PLaD framework. © 2024 Association for Computational Linguistics. |
Voted: yuexi, Srividya
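The ranking loss that PLaD uses to re-calibrate the student's sequence likelihoods can be illustrated with a minimal hinge-style formulation over log-likelihood scores of a pseudo-preference pair; the margin value and exact functional form here are assumptions for illustration, not the paper's precise objective:

```python
def pairwise_ranking_loss(ll_preferred: float, ll_dispreferred: float, margin: float = 1.0) -> float:
    """Hinge loss on a pseudo-preference pair: zero once the preferred
    (teacher) sequence out-scores the dispreferred (student) sequence
    by at least the margin, under the student's own likelihood model."""
    return max(0.0, margin - (ll_preferred - ll_dispreferred))
```

The student is penalized only when it scores its own output too close to, or above, the teacher's output, which steers training toward relative output quality rather than pure imitation.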
#43
-
Zhang 2024
AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment Enabled by Large Language Models
IEEE Trans. Knowl. Data Eng. 2024;36(6):2357-2371 2024 DOI: 10.1109/tkde.2023.3325484 · Ref ID: 3164 The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to our best knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automatic alignment method named AutoAlign, which does not require any manually crafted seed alignments. Specifically, for predicate embeddings, AutoAlign constructs a predicate-proximity-graph with the help of large language models to automatically capture the similarity between predicates across two KGs. For entity embeddings, AutoAlign first computes the entity embeddings of each KG independently using TransE, and then shifts the two KGs' entity embeddings into the same vector space by computing the similarity between entities based on their attributes. Thus, both predicate alignment and entity alignment can be done without manually crafted seed alignments. AutoAlign is not only fully automatic, but also highly effective. Experiments using real-world KGs show that AutoAlign improves the performance of entity alignment significantly compared to state-of-the-art methods. |
Voted: mohammed afaan, yuexi
#1524
-
Zhang 2023
Knowledge-Augmented Frame Semantic Parsing with Hybrid Prompt-Tuning
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 2023;2023-June(): Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/ICASSP49357.2023.10095476 · Ref ID: 5160 Frame semantics-based approaches have been widely used in semantic parsing tasks and have become mainstream. It remains challenging to disambiguate frame representations evoked by target lexical units under different contexts. Pre-trained Language Models (PLMs) have been used in semantic parsing and significantly improve the accuracy of neural parsers. However, the PLMs-based approaches tend to favor collocated patterns presented in the training data, leading to inaccurate outcomes. The intuition here is to design a mechanism to optimally use knowledge captured in semantic frames in conjunction with PLMs to disambiguate frames. We propose a novel Knowledge-Augmented Frame Semantic Parsing Architecture (KAF-SPA) to enhance semantic representation by incorporating accurate frame knowledge into PLMs during frame semantic parsing. Specifically, a Memory-based Knowledge Extraction Module (MKEM) is devised to select accurate frame knowledge and construct the continuous templates in the high dimensional vector space. Moreover, we design a Task-oriented Knowledge Probing Module (TKPM) using hybrid prompts (in terms of continuous and discrete prompts) to incorporate the selected knowledge into the PLMs and adapt PLMs to the tasks of frame and argument identification. Experimental results on two public FrameNet datasets demonstrate that our method significantly outperforms strong baselines (by more than +3% in F1), achieving state-of-the-art results on the current benchmark. Ablation studies verify the effectiveness of KAF-SPA. © 2023 IEEE. |
Voted: Mike, Srividya
#3821
-
Zhang 2022
REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction
arXiv 2022;(): 2022 Ref ID: 7553 Relation extraction is an important but challenging task that aims to extract all hidden relational facts from the text. With the development of deep language models, relation extraction methods have achieved good performance on various benchmarks. However, we observe two shortcomings of previous methods: first, there is no unified framework that works well under various relation extraction settings; second, effectively utilizing external knowledge as background information is absent. In this work, we propose a knowledge-enhanced generative model to mitigate these two issues. Our generative model is a unified framework to sequentially generate relational triplets under various relation extraction settings and explicitly utilizes relevant knowledge from Knowledge Graph (KG) to resolve ambiguities. Our model achieves superior performance on multiple benchmarks and settings, including WebNLG, NYT10, and TACRED. |
Voted: Mike, Xinchen
#336
-
Zhang 2021
HORNET: Enriching Pre-trained Language Representations with Heterogeneous Knowledge Sources
30th ACM International Conference on Information and Knowledge Management (CIKM) 2021;():2608-2617 Univ Queensland, ELECTR NETWORK Assoc Computing Machinery 2021 DOI: 10.1145/3459637.3482436 · Ref ID: 3070 Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the language understanding abilities of deep language models by leveraging the rich semantic knowledge from knowledge graphs, other than plain pre-training texts. However, previous efforts mostly use homogeneous knowledge (especially structured relation triples in knowledge graphs) to enhance the context-aware representations of entity mentions, whose performance may be limited by the coverage of knowledge graphs. Also, it is unclear whether these KEPLMs truly understand the injected semantic knowledge due to the "blackbox" training mechanism. In this paper, we propose a novel KEPLM named HORNET, which integrates Heterogeneous knOwledge from various structured and unstructured sources into the Roberta NETwork and hence takes full advantage of both linguistic and factual knowledge simultaneously. Specifically, we design a hybrid attention heterogeneous graph convolution network (HaHGCN) to learn heterogeneous knowledge representations based on the structured relation triplets from knowledge graphs and the unstructured entity description texts. Meanwhile, we propose the explicit dual knowledge understanding tasks to help induce a more effective infusion of the heterogeneous knowledge, promoting our model for learning the complicated mappings from the knowledge graph embedding space to the deep context-aware embedding space and vice versa. Experiments show that our HORNET model outperforms various KEPLM baselines on knowledge-aware tasks including knowledge probing, entity typing and relation extraction. Our model also achieves substantial improvement over several GLUE benchmark datasets, compared to other KEPLMs. |
Voted: Srividya, Ishan
#1578
-
Zhang 2023
Learning Knowledge-Enhanced Contextual Language Representations for Domain Natural Language Understanding
EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings 2023;():15663-15676 Association for Computational Linguistics (ACL) 2023 Ref ID: 4988 Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the performance of various downstream NLP tasks by injecting knowledge facts from large-scale Knowledge Graphs (KGs). However, existing methods for pre-training KEPLMs with relational triples are difficult to adapt to closed domains due to the lack of sufficient domain graph semantics. In this paper, we propose a Knowledge-enhanced lANGuAge Representation learning framework for various clOsed dOmains (KANGAROO) via capturing the implicit graph structure among the entities. Specifically, since the entity coverage rates of closed-domain KGs can be relatively low and may exhibit the global sparsity phenomenon for knowledge injection, we consider not only the shallow relational representations of triples but also the hyperbolic embeddings of deep hierarchical entity-class structures for effective knowledge fusion. Moreover, as two closed-domain entities under the same entity-class often have locally dense neighbor subgraphs counted by max point biconnected component, we further propose a data augmentation strategy based on contrastive learning over subgraphs to construct hard negative samples of higher quality. It makes the underlying KEPLMs better distinguish the semantics of these neighboring entities to further complement the global semantic sparsity. In the experiments, we evaluate KANGAROO over various knowledge-aware and general NLP tasks in both full and few-shot learning settings, significantly outperforming various KEPLM training paradigms in closed domains. ©2023 Association for Computational Linguistics. |
Voted: Srividya, Xinchen
#723
-
Zhang 2022
Research on the Chinese Named-Entity-Relation-Extraction Method for Crop Diseases Based on BERT
In order to integrate fragmented text data of crop disease knowledge to solve the current problems of disordered knowledge management, weak correlation and difficulty in knowledge sharing, a Chinese named-entity-relation-extraction model for crop diseases (BBCPF) was proposed in this paper by utilizing the advantage of knowledge graphs in describing complex relations between disease entities in a structured form. This model was composed of two parts, i.e., named-entity recognition and relation extraction, in the form of an assembly line. To deal with the different meanings of Chinese crop disease terms in different contexts and to better obtain the contextual information, the BERT model was introduced for dynamic vector representations. Then, the BiLSTM layer was used to learn long-distance text information, and the CRF was applied to obtain the globally optimal labeling sequence, so as to output the crop disease entities. According to the entity category, the entities were divided into subjects and objects, which were then input into the disordered language model PERT to extract the contextual features of the relation data. At last, the fully connected layer was used to decode the information and output the crop disease entity-relation triples. The experiment results show that, on the self-built disease corpus dataset, the Precision, Recall, and F1-Score values of the established model reached 85.63%, 79.46% and 82.43%, respectively, for entity extraction, and reached 97.96%, 98.43% and 98.16%, respectively, for relation extraction. This paper provides an effective method for information extraction in the construction of Chinese crop disease domain knowledge graphs. |
Mike
voted
Davis
voted
Final decision
What was the agreed final decision?
#279
-
Zhang 2024
Fine-tuning large language models for chemical text mining
Extracting knowledge from complex and diverse chemical texts is a pivotal task for both experimental and computational chemists. The task is still considered to be extremely challenging due to the complexity of the chemical language and scientific literature. This study explored the power of fine-tuned large language models (LLMs) on five intricate chemical text mining tasks: compound entity recognition, reaction role labelling, metal-organic framework (MOF) synthesis information extraction, nuclear magnetic resonance spectroscopy (NMR) data extraction, and the conversion of reaction paragraphs to action sequences. The fine-tuned LLMs demonstrated impressive performance, significantly reducing the need for repetitive and extensive prompt engineering experiments. For comparison, we guided ChatGPT (GPT-3.5-turbo) and GPT-4 with prompt engineering and fine-tuned GPT-3.5-turbo as well as other open-source LLMs such as Mistral, Llama3, Llama2, T5, and BART. The results showed that the fine-tuned ChatGPT models excelled in all tasks. They achieved exact accuracy levels ranging from 69% to 95% on these tasks with minimal annotated data. They even outperformed those task-adaptive pre-training and fine-tuning models that were based on a significantly larger amount of in-domain data. Notably, fine-tuned Mistral and Llama3 show competitive abilities. Given their versatility, robustness, and low-code capability, leveraging fine-tuned LLMs as flexible and effective toolkits for automated data acquisition could revolutionize chemical knowledge extraction. Extracting knowledge from complex chemical texts is essential for both experimental and computational chemists. Fine-tuned large language models (LLMs) can serve as flexible and effective extractors for automated data acquisition. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3946
-
Zhang 2024
TrustUQA: A Trustful Framework for Unified Structured Data Question Answering
arXiv 2024;(): 2024 Ref ID: 8429 Natural language question answering (QA) over structured data sources such as tables and knowledge graphs (KGs) has been widely investigated, for example with Large Language Models (LLMs). The main solutions include question-to-formal-query parsing and retrieval-based answer generation. However, current methods of the former often suffer from weak generalization, failing to deal with multiple sources simultaneously, while the latter is limited in trustfulness. In this paper, we propose UnifiedTQA, a trustful QA framework that can simultaneously support multiple types of structured data in a unified way. To this end, it adopts an LLM-friendly and unified knowledge representation method called Condition Graph (CG), and uses an LLM and demonstration-based two-level method for CG querying. For enhancement, it is also equipped with dynamic demonstration retrieval. We have evaluated UnifiedTQA with 5 benchmarks covering 3 types of structured data. It outperforms 2 existing unified structured data QA methods and, in comparison with the baselines that are specific to a data type, achieves state-of-the-art on 2 of them. Furthermore, we demonstrate the potential of our method for more general QA tasks, QA over mixed structured data and QA across structured data. |
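The core idea of answering over heterogeneous sources through one unified representation can be illustrated with a minimal sketch: flatten table rows into triples, then answer a condition query over them. The paper's actual Condition Graph format and LLM-based querying are richer than this; the functions and data below are illustrative only.

```python
# Flatten a table into (row_id, column, value) triples, then answer
# "target_col where cond_col == cond_val" over that unified form.

def table_to_triples(rows):
    """Represent each cell as a (row_id, column, value) triple."""
    return [(i, col, val) for i, row in enumerate(rows)
            for col, val in row.items()]

def query(triples, cond_col, cond_val, target_col):
    """Return target_col values for rows matching the condition."""
    ids = {s for s, p, o in triples if p == cond_col and o == cond_val}
    return [o for s, p, o in triples if s in ids and p == target_col]

rows = [{"country": "France", "capital": "Paris"},
        {"country": "Japan", "capital": "Tokyo"}]
print(query(table_to_triples(rows), "country", "Japan", "capital"))  # ['Tokyo']
```

Because KG edges are already triples, the same `query` interface could serve both source types, which is the unification the paper is after.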
Ishan
voted
brandon
voted
Final decision
What was the agreed final decision?
#1857
-
Zhang 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;1():1946-1965 Association for Computational Linguistics (ACL) 2024 Ref ID: 4394 Despite showing impressive abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., “hallucinations”, even when they hold relevant knowledge. To mitigate these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capability of an LLM to provide training signals that steer the model towards factuality. Specifically, we incorporate SELF-EVAL, a self-evaluation component, to prompt an LLM to validate the factuality of its own generated responses solely based on its internal knowledge. Additionally, we design Self-Knowledge Tuning (SK-TUNING) to augment the LLM's self-evaluation ability by improving the model's confidence estimation and calibration. We then utilize these self-annotated responses to fine-tune the model via Direct Preference Optimization algorithm. We show that the proposed self-alignment approach substantially enhances factual accuracy over LLAMA family models across three key knowledge-intensive tasks on TruthfulQA and BioGEN. © 2024 Association for Computational Linguistics. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#841
-
Zhang 2024
Traditional Chinese Medicine Knowledge Graph Construction Based on Large Language Models
This study explores the use of large language models in constructing a knowledge graph for Traditional Chinese Medicine (TCM) to improve the representation, storage, and application of TCM knowledge. The knowledge graph, based on a graph structure, effectively organizes entities, attributes, and relationships within the TCM domain. By leveraging large language models, we collected and embedded substantial TCM-related data, generating precise representations transformed into a knowledge graph format. Experimental evaluations confirmed the accuracy and effectiveness of the constructed graph, extracting various entities and their relationships, providing a solid foundation for TCM learning, research, and application. The knowledge graph has significant potential in TCM, aiding in teaching, disease diagnosis, treatment decisions, and contributing to TCM modernization. In conclusion, this paper utilizes large language models to construct a knowledge graph for TCM, offering a vital foundation for knowledge representation and application in the field, with potential for future expansion and refinement. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1544
-
Zhang 2024
KnowVrDU: A Unified Knowledge-aware Prompt-Tuning Framework for Visually-rich Document Understanding
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings 2024;():9878-9889 European Language Resources Association (ELRA) 2024 Ref ID: 4601 In Visually-rich Document Understanding (VrDU), recent advances in incorporating layout and image features into pre-trained language models have achieved significant progress. Existing methods usually developed complicated dedicated architectures based on pre-trained models and fine-tuned them with costly high-quality data to eliminate the inconsistency of knowledge distribution between the pre-training task and specialized downstream tasks. However, due to their huge data demands, these methods are not suitable for few-shot settings, which are essential for quick applications with limited resources but have rarely been studied. To solve these problems, we propose a unified Knowledge-aware prompt-tuning framework for Visual-rich Document Understanding (KnowVrDU) to enable broad utilization for diverse concrete applications and reduce data requirements. To model heterogeneous VrDU structures without designing task-specific architectures, we propose to reformulate various VrDU tasks into a single question-answering format with task-specific prompts and train the pre-trained model with the parameter-efficient prompt tuning method. To bridge the knowledge gap between the pre-training task and specialized VrDU tasks without additional annotations, we propose a prompt knowledge integration mechanism to leverage external open-source knowledge bases. We conduct experiments on several benchmark datasets in few-shot settings and the results validate the effectiveness of our method. © 2024 ELRA Language Resource Association: CC BY-NC 4.0. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1602
-
Zhang 2024
Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2024;():13379-13389 Association for Computational Linguistics (ACL) 2024 Ref ID: 4291 Multi-Modal Knowledge Graphs (MMKGs) have proven valuable for various downstream tasks. However, scaling them up is challenging because building large-scale MMKGs often introduces mismatched images (i.e., noise). Most entities in KGs belong to the long tail, meaning there are few images of them available online. This scarcity makes it difficult to determine whether a found image matches the entity. To address this, we draw on the Triangle of Reference Theory and suggest enhancing vision-language models with concept guidance. Specifically, we introduce COG, a two-stage framework with COncept-Guided vision-language models. The framework comprises a CONCEPT INTEGRATION module, which effectively identifies image-text pairs of long-tailed entities, and an EVIDENCE FUSION module, which offers explainability and enables human verification. To demonstrate the effectiveness of COG, we create a dataset of 25k image-text pairs of long-tailed entities. Our comprehensive experiments show that COG not only improves the accuracy of recognizing long-tailed image-text pairs compared to baselines but also offers flexibility and explainability. © 2024 Association for Computational Linguistics. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3147
-
Zhang 2024
Making Large Language Models Perform Better in Knowledge Graph Completion
Proceedings of the 32nd ACM International Conference on Multimedia 2024;():233–242 Melbourne VIC, Australia Association for Computing Machinery 2024 DOI: 10.1145/3664647.3681327 · Ref ID: 7121 |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3255
-
Zhang 2024
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
arXiv 2024;(): 2024 Ref ID: 8436 Large Language Models (LLMs) have exhibited impressive proficiency in various natural language processing (NLP) tasks, which involve increasingly complex reasoning. Knowledge reasoning, a primary type of reasoning, aims at deriving new knowledge from existing knowledge. While it has been widely studied in the context of knowledge graphs (KGs), knowledge reasoning in LLMs remains underexplored. In this paper, we introduce Chain-of-Knowledge (CoK), a comprehensive framework for knowledge reasoning, including methodologies for both dataset construction and model learning. For dataset construction, we create KnowReason via rule mining on KGs. For model learning, we observe rule overfitting induced by naive training. Hence, we enhance CoK with a trial-and-error mechanism that simulates the human process of internal knowledge exploration. We conduct extensive experiments with KnowReason. Our results show the effectiveness of CoK in refining LLMs in not only knowledge reasoning but also general reasoning benchmarks. |
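Deriving new triples from existing ones via a rule, as in the rule mining used to build KnowReason, can be shown with a tiny example. The composition rule and the relations below are invented for illustration, not taken from the paper.

```python
# Apply a composition rule (a, r1, b) ∧ (b, r2, c) => (a, r_new, c)
# over a list of KG triples and return the derived triples.

def apply_composition(triples, r1, r2, r_new):
    by_head = {}
    for h, r, t in triples:
        by_head.setdefault((h, r), []).append(t)
    derived = set()
    for h, r, t in triples:
        if r == r1:
            # chain through any triple whose head is t and relation is r2
            for c in by_head.get((t, r2), []):
                derived.add((h, r_new, c))
    return derived

kg = [("alice", "mother_of", "bob"), ("bob", "father_of", "carol")]
print(apply_composition(kg, "mother_of", "father_of", "grandmother_of"))
```

Derived triples like these can then be verbalized into reasoning chains for training data, which is the spirit of the dataset-construction step.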
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3197
-
Zhang 2024
AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models
arXiv 2024;(): 2024 Ref ID: 8282 Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, they generally suffer from limited generalization capability to diverse knowledge types as well as requirement of expertise in model design and tuning. Addressing these limitations, we seek to utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks given exceptional capabilities in both language understanding and zero-shot task fulfillment. Thus, we propose a fully automatic LLM-based framework to construct attack knowledge graphs named: AttacKG+. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation, including behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance. |
Davis
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#2071
-
Zhang 2023
Large-Scale Biomedical Relation Extraction Across Diverse Relation Types: Model Development and Usability Study on COVID-19
BACKGROUND: Biomedical relation extraction (RE) is of great importance for researchers to conduct systematic biomedical studies. It not only helps knowledge mining, such as knowledge graphs and novel knowledge discovery, but also promotes translational applications, such as clinical diagnosis, decision-making, and precision medicine. However, the relations between biomedical entities are complex and diverse, and comprehensive biomedical RE is not yet well established. OBJECTIVE: We aimed to investigate and improve large-scale RE with diverse relation types and conduct usability studies with application scenarios to optimize biomedical text mining. METHODS: Data sets containing 125 relation types with different entity semantic levels were constructed to evaluate the impact of entity semantic information on RE, and performance analysis was conducted on different model architectures and domain models. This study also proposed a continued pretraining strategy and integrated models with scripts into a tool. Furthermore, this study applied RE to the COVID-19 corpus with article topics and application scenarios of clinical interest to assess and demonstrate its biological interpretability and usability. RESULTS: The performance analysis revealed that RE achieves the best performance when the detailed semantic type is provided. For a single model, PubMedBERT with continued pretraining performed the best, with an F1-score of 0.8998. Usability studies on COVID-19 demonstrated the interpretability and usability of RE, and a relation graph database was constructed, which was used to reveal existing and novel drug paths with edge explanations. The models (including pretrained and fine-tuned models), integrated tool (Docker), and generated data (including the COVID-19 relation graph database and drug paths) have been made publicly available to the biomedical text mining community and clinical researchers. 
CONCLUSIONS: This study provided a comprehensive analysis of RE with diverse relation types. Optimized RE models and tools for diverse relation types were developed, which can be widely used in biomedical text mining. Our usability studies provided a proof-of-concept demonstration of how large-scale RE can be leveraged to facilitate novel research. |
Mike
voted
Ishan
voted
Final decision
What was the agreed final decision?
#1744
-
Zhang 2023
Predicting Dynamic Relationship for Financial Knowledge Graph
Data. Anal. Knowl. Discov. 2023;7(9):39-50 2023 DOI: 10.11925/infotech.2096-3467.2022.0921 · Ref ID: 4885 [Objective] This paper proposes a data-driven prediction method for dynamic relationships, aiming to provide a new perspective for rapidly updating the financial knowledge graph. [Methods] First, we regularly crawled relevant information from the Internet according to the monitoring list. Then, we used the Mask Language Model to construct a dataset and train the model. Third, we extracted the hierarchical structure of the financial knowledge graph to build a hidden layer of the neural network. The neurons contained in the hidden layer represent named entities. Fourth, we connected the hidden layers by a relationship matrix and predicted the dynamic relationships by updating the connection matrix. [Results] We examined the proposed model with the two equity changes at the beginning of the "Baowan" event. Our new model quickly captured the changes in the relationship between corresponding entities of the financial graph in different periods. [Limitations] Due to the characteristics of unsupervised learning, the predicted relationship is relatively divergent, which requires manual calibration verification. [Conclusions] With sufficient data, the proposed method can effectively capture the changes in the relationship between entities without manual annotation. It will effectively and continuously predict the relationship of the financial knowledge graph. © 2023 The Author(s). |
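The "relationship matrix connecting entity neurons, updated as new evidence arrives" idea can be reduced to a bare-bones sketch. The entities, weights, and update rule below are invented for illustration; the paper's model learns these from masked-language-model signals.

```python
# Entities index a connection matrix; each crawled evidence item bumps the
# weight of the (source, target) cell, so relation strength evolves over time.

entities = ["CompanyA", "CompanyB", "FundC"]
idx = {e: i for i, e in enumerate(entities)}
n = len(entities)
M = [[0.0] * n for _ in range(n)]  # connection matrix

def observe(src, dst, weight):
    """Fold one new evidence item into the connection matrix."""
    M[idx[src]][idx[dst]] += weight

def strongest_relation(src):
    """Predict the entity most strongly connected to src right now."""
    row = M[idx[src]]
    return entities[row.index(max(row))]

observe("FundC", "CompanyA", 0.3)
observe("FundC", "CompanyA", 0.5)
observe("FundC", "CompanyB", 0.4)
print(strongest_relation("FundC"))  # CompanyA
```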
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#587
-
Zhang 2021
Multi-Turn Dialogue Reading Comprehension With Pivot Turns and Knowledge
IEEE-ACM Trans. Audio Speech Lang. 2021;29():1161-1173 2021 DOI: 10.1109/taslp.2021.3058616 · Ref ID: 3507 Multi-turn dialogue reading comprehension aims to teach machines to read dialogue contexts and solve tasks such as response selection and answering questions. The major challenges involve noisy history contexts and especial prerequisites of commonsense knowledge that is unseen in the given material. Existing works mainly focus on context and response matching approaches. This work thus makes the first attempt to tackle the above two challenges by extracting substantially important turns as pivot utterances and utilizing external knowledge to enhance the representation of context. We propose a pivot-oriented deep selection model (PoDS) on top of the Transformer-based language models for dialogue comprehension. In detail, our model first picks out the pivot utterances from the conversation history according to the semantic matching with the candidate response or question, if any. Besides, knowledge items related to the dialogue context are extracted from a knowledge graph as external knowledge. Then, the pivot utterances and the external knowledge are combined together with a well-designed mechanism for refining predictions. Experimental results on four dialogue comprehension benchmark tasks show that our proposed model achieves great improvements on baselines. A series of empirical comparisons are conducted to show how our selection strategies and the extra knowledge injection influence the results. |
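The pivot-selection step (pick the history turns that best match the candidate response) can be approximated with simple token overlap. The real PoDS uses learned semantic matching on top of Transformer representations; this scorer is a deliberately crude stand-in.

```python
# Score each history utterance by token overlap with the candidate
# response and keep the top-k as "pivot" turns.

def pick_pivots(history, candidate, k=2):
    cand = set(candidate.lower().split())
    scored = sorted(history,
                    key=lambda u: len(cand & set(u.lower().split())),
                    reverse=True)
    return scored[:k]

history = ["how was your trip", "the weather was great", "we visited museums"]
print(pick_pivots(history, "which museums did you visit", k=1))
```

The selected pivots, plus any retrieved knowledge items, would then feed the downstream comprehension model.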
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#237
-
Zhang 2019
ERNIE: Enhanced Language Representation with Informative Entities
57th Annual Meeting of the Association-for-Computational-Linguistics (ACL) 2019;():1441-1451 Florence, ITALY Assoc Computational Linguistics-Acl 2019 Ref ID: 3419 Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The source code and experiment details of this paper can be obtained from https://github.com/thunlp/ERNIE. |
Srividya
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#666
-
Zhang 2020
Pretrain-KGE: Learning Knowledge Representation from Pretrained Language Models
Meeting of the Association-for-Computational-Linguistics (ACL-EMNLP) 2020;():259-266 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3097 Conventional knowledge graph embedding (KGE) often suffers from limited knowledge representation, leading to performance degradation especially on the low-resource problem. To remedy this, we propose to enrich knowledge representation via pretrained language models by leveraging world knowledge from pretrained models. Specifically, we present a universal training framework named Pretrain-KGE consisting of three phases: semantic-based fine-tuning phase, knowledge extracting phase and KGE training phase. Extensive experiments show that our proposed Pretrain-KGE can improve results over KGE models, especially on solving the low-resource problem. |
Davis
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3289
-
Zhang 2024
Contrastive Learning for Knowledge-Based Question Generation in Large Language Models
arXiv 2024;(): 2024 Ref ID: 8617 With the rapid development of artificial intelligence technology, especially the increasingly widespread application of question-and-answer systems, high-quality question generation has become a key component in supporting the development of these systems. This article focuses on knowledge-based question generation technology, which aims to enable computers to simulate the human questioning process based on understanding specific texts or knowledge bases. In light of the issues of hallucination and knowledge gaps present in large-scale language models when applied to knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. This method utilizes multiple models to jointly mine domain knowledge and uses contrastive learning to guide the model in reducing noise and hallucinations in generation. Experimental results show that by designing prompts containing contrasting examples, the model's performance in question generation improves considerably, particularly when contrasting instructions and examples are used simultaneously, leading to the highest quality of generated questions and improved accuracy. These results demonstrate that the method proposed in this study, which combines contrasting context and chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3134
-
Zhang 2024
A GAIL Fine-Tuned LLM Enhanced Framework for Low-Resource Knowledge Graph Question Answering
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3300–3309 Boise, ID, USA Association for Computing Machinery 2024 DOI: 10.1145/3627673.3679753 · Ref ID: 7116 |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3695
-
Zhang 2023
Make Them Spill the Beans! Coercive Knowledge Extraction from (Production) LLMs
arXiv 2023;(): 2023 Ref ID: 7976 Large Language Models (LLMs) are now widely used in various applications, making it crucial to align their ethical standards with human values. However, recent jail-breaking methods demonstrate that this alignment can be undermined using carefully constructed prompts. In our study, we reveal a new threat to LLM alignment when a bad actor has access to the model's output logits, a common feature in both open-source LLMs and many commercial LLM APIs (e.g., certain GPT models). It does not rely on crafting specific prompts. Instead, it exploits the fact that even when an LLM rejects a toxic request, a harmful response often hides deep in the output logits. By forcefully selecting lower-ranked output tokens during the auto-regressive generation process at a few critical output positions, we can compel the model to reveal these hidden responses. We term this process model interrogation. This approach differs from and outperforms jail-breaking methods, achieving 92% effectiveness compared to 62%, and is 10 to 20 times faster. The harmful content uncovered through our method is more relevant, complete, and clear. Additionally, it can complement jail-breaking strategies, further boosting attack performance. Our findings indicate that interrogation can extract toxic knowledge even from models specifically designed for coding tasks. |
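The mechanical core of "forcefully selecting lower-ranked output tokens" is just indexing into the ranked logits instead of taking the argmax. The toy logits below are made up for demonstration; a real attack applies this at selected positions inside an auto-regressive decoding loop.

```python
# Pick the token at a given rank of the logits (rank 0 = ordinary argmax).

def select_token(logits, rank=0):
    """Return the token at the given rank of a {token: logit} dict."""
    ranked = sorted(logits, key=logits.get, reverse=True)
    return ranked[rank]

logits = {"Sorry": 5.1, "Sure": 3.2, "I": 2.4}
print(select_token(logits))          # argmax: "Sorry" (a refusal opener)
print(select_token(logits, rank=1))  # forced rank-1 token: "Sure"
```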
Davis
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1072
-
Zhao 2021
A Chinese Machine Reading Comprehension Dataset Automatic Generated Based on Knowledge Graph
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2021;12869 LNAI():268-279 Springer Science and Business Media Deutschland GmbH 2021 DOI: 10.1007/978-3-030-84186-7_18 · Ref ID: 5607 Machine reading comprehension (MRC) is a typical natural language processing (NLP) task and has developed rapidly in the last few years. Various reading comprehension datasets have been built to support MRC studies. However, large-scale and high-quality datasets are rare due to the high complexity and huge workforce cost of making such a dataset. Besides, most reading comprehension datasets are in English, and Chinese datasets are insufficient. In this paper, we propose an automatic method for MRC dataset generation, and build the largest Chinese medical reading comprehension dataset presently named CMedRC. Our dataset contains 17k questions generated by our automatic method and some seed questions. We obtain the corresponding answers from a medical knowledge graph and manually check all of them. Finally, we test BiLSTM and BERT-based pre-trained language models (PLMs) on our dataset and propose a baseline for the following studies. Results show that the automatic MRC dataset generation method is considerable for future model improvements. © 2021, Springer Nature Switzerland AG. |
Davis
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#3243
-
Zhao 2021
Calculating Question Similarity is Enough: A New Method for KBQA Tasks
arXiv 2021;(): 2021 Ref ID: 7496 Knowledge Base Question Answering (KBQA) aims to answer natural language questions with the help of an external knowledge base. The core idea is to find the link between the internal knowledge behind questions and known triples of the knowledge base. Traditional KBQA task pipelines contain several steps, including entity recognition, entity linking, answer selection, etc. In this kind of pipeline method, errors in any procedure will inevitably propagate to the final prediction. To address this challenge, this paper proposes a Corpus Generation - Retrieve Method (CGRM) with Pre-training Language Model (PLM) for the KBQA task. The major novelty lies in the design of the new method, wherein our approach, the knowledge enhanced T5 (kT5) model, aims to generate natural language QA pairs based on Knowledge Graph triples and directly solve the QA by retrieving the synthetic dataset. The new method can extract more information about the entities from PLM to improve accuracy and simplify the processes. We test our method on the NLPCC-ICCPOL 2016 KBQA dataset, and the results show that our method improves the performance of KBQA and our straightforward method is competitive with the state of the art. |
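The corpus-generation step (turning KG triples into natural-language QA pairs) can be sketched with per-relation templates. CGRM uses a knowledge-enhanced T5 for this; the template table and relations below are purely illustrative.

```python
# Map each relation to a question template; the answer is whichever side
# of the triple the template does not mention.

TEMPLATES = {
    "capital_of": "What is the capital of {tail}?",
    "born_in": "Where was {head} born?",
}

def triples_to_qa(triples):
    qa = []
    for head, rel, tail in triples:
        tpl = TEMPLATES.get(rel)
        if tpl:
            # if the question names the tail, the head is the answer
            answer = head if "{tail}" in tpl else tail
            qa.append((tpl.format(head=head, tail=tail), answer))
    return qa

print(triples_to_qa([("Paris", "capital_of", "France")]))
```

Retrieval over the resulting synthetic QA corpus then replaces the brittle multi-step pipeline, which is the error-propagation fix the abstract describes.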
Mike
voted
Kwesi
voted
Final decision
What was the agreed final decision?
#1737
-
Zhao 2024
Power Large Language Model Exploration: Activation, Measurement and Enhancement for Operations and Maintenance Knowledge: Activation, Measurement and Enhancement for Power O&M Knowledge
ACM International Conference Proceeding Series 2024;():1-7 Association for Computing Machinery 2024 DOI: 10.1145/3689218.3689222 · Ref ID: 3847 With the rapid advancement of Large Language Models, their applications are gradually transitioning from general to specific domains. However, the application of LLM in the electric power domain is still in its early stages, and few studies have explored power LLM. Currently, there are two main challenges facing power LLMs: (1) determining how to measure the real power knowledge capacity of LLMs to facilitate targeted enhancement of specific knowledge. (2) identifying practical enhancement methods to facilitate efficient and feasible power LLM applications in real-world scenarios. In this paper, we ask three insightful questions that address the power knowledge capacity of LLMs and then draw inspiration from Reflexion and CoT to design an Activation, Measurement and Enhancement framework (AME) for power operations and maintenance (O&M) knowledge. Specifically, we ask three "HOW" questions based on the activation, measurement, and enhancement of power O&M knowledge. We introduce a Reflexion Module to discover the knowledge capacity of LLM and a Knowledge Graph Module to provide external knowledge of LLM in our proposed AME. Experiments on the real-world dataset provide strong evidence when we answer the above three insightful questions. © 2024 ACM. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#746
-
Zhao 2024
Self-consistency, Extract and Rectify: Knowledge Graph Enhance Large Language Model for Electric Power Question Answering
20th International Conference on Intelligent Computing (ICIC) 2024;14873():493-504 Tianjin Univ Sci & Tech, Tianjin, PEOPLES R CHINA Springer-Verlag Singapore Pte Ltd 2024 DOI: 10.1007/978-981-97-5615-5_40 · Ref ID: 3172 Electric power artificial intelligence has rapidly advanced in recent years, encompassing safety detection, assistant decision-making, and optimal scheduling. With the rise of Large Language Models (LLMs), knowledge-based AI is becoming increasingly prevalent across various domains. However, in the field of electric power, most of the knowledge-based AI is centered on Knowledge Graph (KG) techniques, while less research has been done on power LLMs. In this paper, we are inspired by Self-Consistency (SC) and propose a Self-Consistency, Extraction and Rectify framework (SCER) for the usage of KG-enhanced LLM in power operations and maintenance (O&M) question answering scenarios. Specifically, we transfer the SC from the general-purpose domain into the power domain and replace the original model with a Chinese sentence representation model to make it more localized. We design an Extract Mechanism to generate evidence chains through multiple random walks on the POMKG and a Rectify Mechanism to correct the score of the generated rationales. Extensive experiments and specific case studies on the POMQA dataset demonstrate the effectiveness of our proposed SCER for SC transfer and improvement in the power field. |
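Generating an evidence chain via a random walk over a KG adjacency list, as in the Extract Mechanism, looks roughly like the sketch below. The graph, entities, and walk length are invented for illustration; the paper walks over its POMKG.

```python
# Random walk from a start entity, recording relation/entity hops as an
# evidence chain; a seeded RNG keeps the walk reproducible.
import random

def random_walk(graph, start, length, rng):
    chain = [start]
    node = start
    for _ in range(length):
        edges = graph.get(node)
        if not edges:  # dead end: stop early
            break
        rel, node = rng.choice(edges)
        chain += [rel, node]
    return chain

graph = {"transformer": [("part_of", "substation")],
         "substation": [("monitored_by", "SCADA")]}
print(random_walk(graph, "transformer", 2, random.Random(0)))
```

Running several walks from the entities mentioned in a question yields multiple chains, which can then be scored and rectified before prompting the LLM.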
Mike
voted
Davis
voted
Final decision
What was the agreed final decision?
#3246
-
Zhao 2024
Can Language Model Understand Word Semantics as A Chatbot? An Empirical Study of Language Model Internal External Mismatch
arXiv 2024;(): 2024 Ref ID: 8616 Current common interactions with language models are through full inference. This approach may not necessarily align with the model's internal knowledge. Studies show discrepancies between prompts and internal representations. Most focus on sentence understanding. We study this internal-external mismatch in word-semantics understanding across Encoder-only, Decoder-only, and Encoder-Decoder pre-trained language models. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#2268
-
Zhao 2014
A concept-based knowledge representation model for semantic entailment inference
Proceedings of the 33rd Chinese Control Conference 2014;():522-527 2014 DOI: 10.1109/ChiCC.2014.6896678 · Ref ID: 6050 Semantic entailment is a fundamental problem in the natural language understanding field with a large number of applications. Knowledge acquisition and knowledge representation are crucial parts in semantic inference strategies. This paper presents a principled approach to the semantic entailment problem that builds on a concept-based knowledge representation model (CKR). This model formally defines a concept as a triple (attribute, relation, and behavior), and the knowledge of a concept can be illustrated by this triple. We propose a semantic inference strategy that aims to identify text segments that have dissimilar surface forms but share a common meaning. The inference strategy avoids syntactic analysis steps. A preliminary evaluation on the PASCAL text collection is presented. Experimental results show that our concept-based inference strategy is effective and has strong development potential. |
Mike
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3113
-
Zhao 2024
Breaking the Barrier: Utilizing Large Language Models for Industrial Recommendation Systems through an Inferential Knowledge Graph
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():5086–5093 Boise, ID, USA Association for Computing Machinery 2024 DOI: 10.1145/3627673.3680022 · Ref ID: 7138 |
Srividya
voted
brandon
voted
Final decision
What was the agreed final decision?
#320
-
Zhao 2024
Graph Reasoning Transformers for Knowledge-Aware Question Answering
38th AAAI Conference on Artificial Intelligence (AAAI) / 36th Conference on Innovative Applications of Artificial Intelligence / 14th Symposium on Educational Advances in Artificial Intelligence 2024;():19652-19660 Vancouver, CANADA Assoc Advancement Artificial Intelligence 2024 Ref ID: 3470 Augmenting Language Models (LMs) with structured knowledge graphs (KGs) aims to leverage structured world knowledge to enhance the capability of LMs to complete knowledge-intensive tasks. However, existing methods are unable to effectively utilize the structured knowledge in a KG due to their inability to capture the rich relational semantics of knowledge triplets. Moreover, the modality gap between natural language text and KGs has become a challenging obstacle when aligning and fusing cross-modal information. To address these challenges, we propose a novel knowledge augmented question answering (QA) model, namely, Graph Reasoning Transformers (GRT). Different from conventional node-level methods, the GRT treats knowledge triplets as atomic knowledge units and utilizes a triplet-level graph encoder to capture triplet-level graph features. Furthermore, to alleviate the negative effect of the modality gap on joint reasoning, we propose a representation alignment pretraining to align the cross-modal representations and introduce a cross-modal information fusion module with attention bias to enable cross-modal information fusion. Extensive experiments conducted on three knowledge-intensive QA benchmarks show that the GRT outperforms the state-of-the-art KG-augmented QA systems, demonstrating the effectiveness and adaptability of our proposed model. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3170
-
Zhao 2024
Zero-shot Knowledge Graph Question Generation via Multi-agent LLMs and Small Models Synthesis
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3341–3351 Boise, ID, USA Association for Computing Machinery 2024 DOI: 10.1145/3627673.3679805 · Ref ID: 7114 |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3180
-
Zhao 2024
AGENTiGraph: An Interactive Knowledge Graph Platform for LLM-based Chatbots Utilizing Private Data
arXiv 2024;(): 2024 Ref ID: 8710 Large Language Models (LLMs) have demonstrated capabilities across various applications but face challenges such as hallucination, limited reasoning abilities, and factual inconsistencies, especially when tackling complex, domain-specific tasks like question answering (QA). While Knowledge Graphs (KGs) have been shown to help mitigate these issues, research on the integration of LLMs with background KGs remains limited. In particular, user accessibility and the flexibility of the underlying KG have not been thoroughly explored. We introduce AGENTiGraph (Adaptive Generative ENgine for Task-based Interaction and Graphical Representation), a platform for knowledge management through natural language interaction. It integrates knowledge extraction, integration, and real-time visualization. AGENTiGraph employs a multi-agent architecture to dynamically interpret user intents, manage tasks, and integrate new knowledge, ensuring adaptability to evolving user requirements and data contexts. Our approach demonstrates superior performance in knowledge graph interactions, particularly for complex domain-specific tasks. Experimental results on a dataset of 3,500 test cases show AGENTiGraph significantly outperforms state-of-the-art zero-shot baselines, achieving 95.12% accuracy in task classification and 90.45% success rate in task execution. User studies corroborate its effectiveness in real-world scenarios. To showcase versatility, we extended AGENTiGraph to legislation and healthcare domains, constructing specialized KGs capable of answering complex queries in legal and medical contexts. |
Xinchen
voted
mohammed afaan
voted
Final decision
What was the agreed final decision?
#1510
-
Zheng 2024
A Knowledge Graph Modeling Approach for Augmenting Language Model-Based Contract Risk Identification
Proceedings of the European Conference on Computing in Construction 2024;2024():260-267 European Council on Computing in Construction (EC3) 2024 DOI: 10.35490/EC3.2024.178 · Ref ID: 4393 Contract risk identification is essential for preventing disputes and losses in the construction industry. Large language models (LLMs) have impacted various natural language processing tasks, offering a promising avenue for automating contract review without extensive data processing and feature engineering. However, LLMs still have difficulty recalling facts while generating knowledge-grounded analysis, especially when complex domain knowledge is involved. This paper introduces a Knowledge Graph (KG) modeling approach to enhance LLM-based automated contract risk identification. A case study demonstrates that our approach exhibits enhanced performance on risk identification tasks compared to the non-augmentation scenario. © 2024 European Council on Computing in Construction. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3572
-
Zheng 2023
KGLens: Towards Efficient and Effective Knowledge Probing of Large Language Models with Knowledge Graphs
arXiv 2023;(): 2023 Ref ID: 7997 Large Language Models (LLMs) might hallucinate facts, while curated Knowledge Graphs (KGs) are typically factually reliable, especially with domain-specific knowledge. Measuring the alignment between KGs and LLMs can effectively probe the factualness and identify the knowledge blind spots of LLMs. However, verifying the LLMs over extensive KGs can be expensive. In this paper, we present KGLens, a Thompson-sampling-inspired framework aimed at effectively and efficiently measuring the alignment between KGs and LLMs. KGLens features a graph-guided question generator for converting KGs into natural language, along with a carefully designed importance sampling strategy based on parameterized KG structure to expedite KG traversal. Our simulation experiment compares the brute force method with KGLens under six different sampling methods, demonstrating that our approach achieves superior probing efficiency. Leveraging KGLens, we conducted in-depth analyses of the factual accuracy of ten LLMs across three large domain-specific KGs from Wikidata, comprising over 19K edges, 700 relations, and 21K entities. Human evaluation results indicate that KGLens can assess LLMs with a level of accuracy nearly equivalent to that of human annotators, achieving 95.7% of the accuracy rate. |
yuexi
voted
Davis
voted
Final decision
What was the agreed final decision?
#3262
-
Zheng 2024
CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge
arXiv 2024;(): 2024 Ref ID: 8495 While large language models (LLMs) have demonstrated impressive capabilities across various natural language processing tasks by acquiring rich factual knowledge from their broad training data, their ability to synthesize and logically reason with this knowledge in complex ways remains underexplored. In this work, we present a systematic evaluation of state-of-the-art LLMs' complex logical reasoning abilities through a novel benchmark of automatically generated complex reasoning questions over general domain and biomedical knowledge graphs. Our extensive experiments, employing diverse in-context learning techniques, reveal that LLMs excel at reasoning over general world knowledge but face significant challenges with specialized domain-specific knowledge. We find that prompting with explicit Chain-of-Thought demonstrations can substantially improve LLM performance on complex logical reasoning tasks with diverse logical operations. Interestingly, our controlled evaluations uncover an asymmetry where LLMs display proficiency at set union operations, but struggle considerably with set intersections - a key building block of logical reasoning. To foster further work, we will publicly release our evaluation benchmark and code. |
Srividya
voted
Ishan
voted
Final decision
What was the agreed final decision?
#3613
-
Zheng 2024
A Knowledge-Enhanced Disease Diagnosis Method Based on Prompt Learning and BERT Integration
arXiv 2024;(): 2024 Ref ID: 8603 This paper proposes a knowledge-enhanced disease diagnosis method based on a prompt learning framework. The method retrieves structured knowledge from external knowledge graphs related to clinical cases, encodes it, and injects it into the prompt templates to enhance the language model's understanding and reasoning capabilities for the task. We conducted experiments on three public datasets: CHIP-CTC, IMCS-V2-NER, and KUAKE-QTR. The results show that the proposed method significantly outperforms existing models across multiple evaluation metrics, with an F1 score improvement of 2.4% on the CHIP-CTC dataset, 3.1% on the IMCS-V2-NER dataset, and 4.2% on the KUAKE-QTR dataset. Additionally, ablation studies confirmed the critical role of the knowledge injection module, as the removal of this module resulted in a significant drop in F1 score. The experimental results demonstrate that the proposed method not only effectively improves the accuracy of disease diagnosis but also enhances the interpretability of the predictions, providing more reliable support and evidence for clinical diagnosis. |
Kwesi
voted
Xinchen
voted
Final decision
What was the agreed final decision?
#887
-
Zhengbao 2020
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020;():5943-5959 Electr Network Assoc Computational Linguistics-Acl 2020 Ref ID: 3649 Language models (LMs) have proven surprisingly successful at capturing factual knowledge by completing cloze-style fill-in-the-blank questions such as "Punta Cana is located in _." However, while knowledge is both written and queried in many languages, studies on LMs' factual representation ability have almost invariably been performed on English. To assess factual knowledge retrieval in LMs in different languages, we create a multilingual benchmark of cloze-style probes for 23 typologically diverse languages. To properly handle language variations, we expand probing methods from single- to multi-word entities, and develop several decoding algorithms to generate multi-token predictions. Extensive experimental results provide insights about how well (or poorly) current state-of-the-art LMs perform at this task in languages with more or fewer available resources. We further propose a code-switching-based method to improve the ability of multilingual LMs to access knowledge, and verify its effectiveness on several benchmark languages. Benchmark data and code have been released at https://x-factr.github.io. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1840
-
Zhong 2023
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
Findings of the Association for Computational Linguistics: EMNLP 2023 2023;():7055-7067 Association for Computational Linguistics (ACL) 2023 Ref ID: 5049 We introduce RoMQA, the first benchmark for robust, multi-evidence, multi-answer question answering (QA). RoMQA contains clusters of questions that are derived from related constraints mined from the Wikidata knowledge graph. RoMQA evaluates robustness of QA models to varying constraints by measuring worst-case performance within each question cluster. Compared to prior QA datasets, RoMQA has more human-written questions that require reasoning over more evidence text and have, on average, many more correct answers. In addition, human annotators rate RoMQA questions as more natural or likely to be asked by people. We evaluate state-of-the-art large language models in zero-shot, few-shot, and fine-tuning settings, and find that RoMQA is challenging: zero-shot and few-shot models perform similarly to naive baselines, while supervised retrieval methods perform well below gold evidence upper bounds. Moreover, existing models are not robust to variations in question constraints, but can be made more robust by tuning on clusters of related questions. Our results show that RoMQA is a challenging benchmark for large language models, and provides a quantifiable test to build more robust QA methods. © 2023 Association for Computational Linguistics. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#759
-
Zhong 2022
Semantics Driven Embedding Learning for Effective Entity Alignment
38th IEEE International Conference on Data Engineering (ICDE) 2022;():2127-2140 Electr Network Ieee Computer Soc 2022 DOI: 10.1109/icde53745.2022.00205 · Ref ID: 3568 Knowledge-based data service has become an emerging form of service in the world wide web (WWW). To ensure the service quality, a comprehensive knowledge base has to be constructed. Knowledge base integration is often a primary way to improve the completeness. In this paper, we focus on the fundamental problem in knowledge base integration, i.e., entity alignment (EA). EA has been studied for years. Traditional approaches focus on the symbolic features of entities and propose various similarity measures to identify equivalent entities. With recent development in knowledge graph representation learning, embedding-based entity alignment has emerged, which encodes the entities into vectors according to the semantic or structural information and computes the relatedness of entities based on the vector representation. While embedding-based approaches achieve promising results, we identify some important information that are not well exploited in existing works: 1) The neighboring entities contribute differently in the EA process, and should be carefully assigned the importance in learning the relatedness of entities; 2) The attribute values (especially the long texts) contain rich semantics that can build supplementary associations between entities. To this end, we propose SDEA - a Semantics Driven entity embedding method for Entity Alignment. SDEA consists of two modules, namely attribute embedding and relation embedding. The attribute embedding captures the semantic information from attribute values with a pre-trained transformer-based language model. The relation embedding selectively aggregates the semantic information from neighbors using a GRU model equipped with an attention mechanism. Both attribute embedding and relation embedding are driven by semantics, building bridges between entities. 
Experimental results show that our method significantly outperforms the state-of-the-art approaches on three benchmarks. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#525
-
Zhou 2022
Leveraging on causal knowledge for enhancing the root cause analysis of equipment spot inspection failures
Causal correlation data over the equipment spot-inspection operation and maintenance (O&M) records and fault investigation sheets potentially reflect the state related to the causal effect of equipment failures. Various factors influence equipment failures, making it difficult to effectively analyze the main cause of the problems. Mining and leveraging these causal data from the equipment spot inspection records will undoubtedly significantly improve the root cause analysis of the fault in the O&M system. Hence, this paper introduces causal knowledge in equipment fault O&M for the first time and proposes to exploit causal knowledge for enhancing root cause analysis of equipment spot inspection failures. Specifically, an equipment fault O&M knowledge graph with causal knowledge called CausalKG is constructed to provide knowledge support for the causal analysis of faults. That is, CausalKG consists of spot-inspection knowledge graph (SIKG) and causal relationship knowledge (CRK) in equipment fault O&M. Further, a CausalKG-ALBERT knowledge reasoning model is designed. The model transforms CausalKG into network embeddings based on relational graph convolutional networks. In turn, it combines the Q&A mechanism of the language model ALBERT to mine the root cause knowledge of equipment failures. The case study confirms that incorporating the CRK is more effective than directly using the SIKG for causality reasoning; the model can fully use causal relationship knowledge to enhance the reliability of root cause analysis. This method is valuable to help engineers strengthen their causal analysis capabilities in preventive equipment maintenance. |
mohammed afaan
voted
Ishan
voted
Final decision
What was the agreed final decision?
#92
-
Zhou 2024
CausalKGPT: Industrial structure causal knowledge-enhanced large language model for cause analysis of quality problems in aerospace product manufacturing
The whole cycle for manufacturing aerospace thin-walled shells is a lengthy and sophisticated process. A large amount of quality-related data exists within and between processes, involving many types of quality defects and influencing factors. However, there are ambiguous causal associations among quality-related data affecting the shape-properties of the shell. Also, the coupling of long processes and multiple factors makes it hard to analyze the main factors that affect the quality defects in shell manufacturing. In this paper, taking into account the advantages of causal science and the large language model (LLM), we propose an industrial structure causal knowledge-enhanced large language model for the cause analysis of quality defects in aerospace product manufacturing. To reinforce the causal associations among quality-related data deriving from manufacturing documents (product defect survey sheets, quality inspection, and maintenance reports), a structure causal graph-based sum-product network (SCG-SPN) model is designed to model machining quality-related knowledge and eliminate pseudo-association confounding factors by doing an intervention. Thus, a causal quality-related knowledge graph (CQKG) with high-quality causal associations is constructed. With this, to provide a trustworthy guarantee in responding to quality problem solving, we construct a quality-related prompt dataset with multi-round conversations based on CQKG. Then, a novel P-tuning that adapts to utilize external CQKG instructions is designed to fine-tune an open-source ChatGLM base model. Based on this, a causal knowledge graph-augmented LLM, named CausalKGPT, is developed to enable reasoning and responding to quality defects in both Chinese and English. It uses natural text descriptions related to quality defects as input and takes a quality-related causal knowledge graph as an additional corpus.
Finally, the case study shows that CausalKGPT performs with more expertise and reliability in responding to quality question solving of aerospace shell manufacturing than classic commercial models like ChatGPT and GPT-4. The results indicate that the proposed method may provide a trustworthy guide in assisting workers to analyze quality defects in aerospace products. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1774
-
Zhou 2023
PROTEIN REPRESENTATION LEARNING VIA KNOWLEDGE ENHANCED PRIMARY STRUCTURE MODELING
11th International Conference on Learning Representations, ICLR 2023 2023;(): International Conference on Learning Representations, ICLR 2023 Ref ID: 4991 Protein representation learning has primarily benefited from the remarkable development of language models (LMs). Accordingly, pre-trained protein models also suffer from a problem in LMs: a lack of factual knowledge. The recent solution models the relationships between protein and associated knowledge terms as the knowledge encoding objective. However, it fails to explore the relationships at a more granular level, i.e., the token level. To mitigate this, we propose Knowledge-exploited Auto-encoder for Protein (KeAP), which performs token-level knowledge graph exploration for protein representation learning. In practice, non-masked amino acids iteratively query the associated knowledge tokens to extract and integrate helpful information for restoring masked amino acids via attention. We show that KeAP can consistently outperform the previous counterpart on 9 representative downstream applications, sometimes surpassing it by large margins. These results suggest that KeAP provides an alternative yet effective way to perform knowledge enhanced protein representation learning. Code and models are available at https://github.com/RL4M/KeAP. © 2023 11th International Conference on Learning Representations, ICLR 2023. All rights reserved. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3107
-
Zhou 2024
Automated Medical Report Generation and Visual Question Answering
Proceedings of the 1st International Workshop on Multimedia Computing for Health and Medicine 2024;():3–4 Melbourne VIC, Australia Association for Computing Machinery 2024 DOI: 10.1145/3688868.3689189 · Ref ID: 7309 |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#3398
-
Zhou 2024
Establishing Knowledge Preference in Language Models
arXiv 2024;(): 2024 Ref ID: 8470 Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to provide recommendations, the model should prioritize user specifications over retrieved product reviews; when some facts are edited in the model, the updated facts should override all prior knowledge learned by the model even if they are conflicting. In all of the cases above, the model faces a decision between its own parametric knowledge, (retrieved) contextual knowledge, and user instruction knowledge. In this paper, we (1) unify such settings into the problem of knowledge preference and define a three-level preference hierarchy over these knowledge sources; (2) compile a collection of existing datasets IfQA, MQuAKE, and MRQA covering a combination of settings (with/without user specifications, with/without context documents) to systematically evaluate how well models obey the intended knowledge preference; and (3) propose a dataset synthesis method that composes diverse question-answer pairs with user assumptions and related context to directly fine-tune LMs for instilling the hierarchy of knowledge. We demonstrate that a 7B model, fine-tuned on only a few thousand examples automatically generated by our proposed method, effectively achieves superior performance (more than 18% improvement across all evaluation benchmarks) in adhering to the desired knowledge preference hierarchy. |
Mike
voted
Srividya
voted
Final decision
What was the agreed final decision?
#155
-
Zhou 2024
D-Bot: Database Diagnosis System using Large Language Models
Database administrators (DBAs) play an important role in managing database systems. However, it is hard and tedious for DBAs to manage vast database instances and give timely responses (waiting for hours is intolerable in many online cases). In addition, existing empirical methods support only limited diagnosis scenarios and require labor-intensive updates of diagnosis rules for each database version. Recently, large language models (LLMs) have shown great potential in various fields. Thus, we propose D-Bot, an LLM-based database diagnosis system that can automatically acquire knowledge from diagnosis documents, and generate reasonable and well-founded diagnosis reports (i.e., identifying the root causes and solutions) within acceptable time (e.g., under 10 minutes compared to hours by a DBA). The techniques in D-Bot include (i) offline knowledge extraction from documents, (ii) automatic prompt generation (e.g., knowledge matching, tool retrieval), (iii) root cause analysis using a tree search algorithm, and (iv) a collaborative mechanism for complex anomalies with multiple root causes. We verify D-Bot on real benchmarks (including 539 anomalies of six typical applications), and the results show D-Bot can effectively identify root causes of unseen anomalies and significantly outperforms traditional methods and vanilla models like GPT-4. |
mohammed afaan
voted
yuexi
voted
Final decision
What was the agreed final decision?
#1935
-
Zhou 2024
Temporal Closing Path for PLM-based Temporal Knowledge Graph Completion
Proceedings of the International Joint Conference on Neural Networks 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 DOI: 10.1109/IJCNN60899.2024.10650003 · Ref ID: 4238 Temporal Knowledge Graph Completion (TKGC) aims to predict missing parts of quadruples, which is crucial for real-life knowledge graphs. Compared with methods that only use graph neural networks, the emergence of pre-trained models has introduced a trend of simultaneously leveraging text and graph structure information. However, most current methods based on pre-trained models struggle to effectively utilize both text and multi-hop graph structure information concurrently, resulting in insufficient association mining of relations. To address the challenge, we propose a novel model: Temporal Closing Path for Pre-trained Language Model-based TKGC (TCP-PLM). We obtain the temporal closing relation path of the target relation through sampling, and use the relation path as a bridge to simultaneously utilize text and multi-hop graph structure information. Moreover, the relation path serves as a tool for mining associations between relations. At the same time, due to the design of entity-independent relation paths, our model can also handle the inductive setting. Our experiments on three benchmarks, along with extensive analysis, demonstrate that our model not only achieves substantial performance enhancements across four metrics compared to other models but also adeptly handles inductive settings. © 2024 IEEE. |
Ishan
voted
Srividya
voted
Final decision
What was the agreed final decision?
#3957
-
Zhou 2024
Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs
arXiv 2024;(): 2024 Ref ID: 8736 Retrieval-Augmented Generation (RAG) significantly improved the ability of Large Language Models (LLMs) to solve knowledge-intensive tasks. While existing research seeks to enhance RAG performance by retrieving higher-quality documents or designing RAG-specific LLMs, the internal mechanisms within LLMs that contribute to the effectiveness of RAG systems remain underexplored. In this paper, we aim to investigate these internal mechanisms within the popular Mixture-of-Expert (MoE)-based LLMs and demonstrate how to improve RAG by examining expert activations in these LLMs. Our controlled experiments reveal that several core groups of experts are primarily responsible for RAG-related behaviors. The activation of these core experts can signify the model's inclination towards external/internal knowledge and adjust its behavior. For instance, we identify core experts that can (1) indicate the sufficiency of the model's internal knowledge, (2) assess the quality of retrieved documents, and (3) enhance the model's ability to utilize context. Based on these findings, we propose several strategies to enhance RAG's efficiency and effectiveness through expert activation. Experimental results across various datasets and MoE-based LLMs show the effectiveness of our method. |
yuexi
voted
Srividya
voted
Final decision
What was the agreed final decision?
#1985
-
Zhou 2023
Traditional Chinese Medicine Epidemic Prevention and Treatment Question-Answering Model Based on LLMs
Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 2023;():4755-4760 Institute of Electrical and Electronics Engineers Inc. 2023 DOI: 10.1109/BIBM58861.2023.10385748 · Ref ID: 4919 Background: Epidemic diseases in Traditional Chinese Medicine (TCM) constitute an essential part of Chinese medical science. TCM has accumulated rich theoretical and practical experiences in the prevention and treatment of epidemic diseases, forming the academic system of epidemic febrile disease, providing robust support for epidemic prevention and resistance in TCM. However, the numerous and complex literature on TCM epidemic diseases brings challenges to the organization and discovery of epidemic disease knowledge. Objective: To leverage the powerful knowledge learning ability of state-of-the-art large language models (LLMs) to address the efficient acquisition and utilization of TCM epidemic disease knowledge. Methods: By collecting content related to epidemic diseases from 194 ancient TCM books, as well as the knowledge graph of TCM epidemic disease prevention and treatment, we built the large TCM epidemic disease model EpidemicCHAT based on the ChatGLM model. To assess the performances of the model, several open-source LLMs were compared in the study. Results: Compared to traditional LLMs, which may fail to answer or produce hallucinations in the field of TCM epidemic diseases, EpidemicCHAT demonstrates superior answering and reasoning abilities. In the evaluation of TCM epidemic disease prescription generation, the model achieved scores of 44.02, 61.10, and 59.40 on the BLEU-4, ROUGE-L, and METEOR metrics, respectively. Conclusion: The EpidemicCHAT model proposed in this study performs excellently in the field of TCM epidemic diseases, which might provide a reference for the construction of TCM LLMs and applications such as TCM auxiliary diagnosis and Chinese herbal prescription generation. © 2023 IEEE. |
Kwesi voted
brandon voted
#1733
-
Zhu 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Findings of the Association for Computational Linguistics: NAACL 2024 - Findings 2024;():4737-4751 Association for Computational Linguistics (ACL) 2024 Ref ID: 4580 Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of “hallucination”, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph-a Polygraph for LLMs-as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly differs from the large body of existing research that concentrates on addressing such challenges through black-box evaluations. In particular, we demonstrate that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics during generation via tractable probabilistic models. Experimental results on various open-source LLMs confirm the efficacy of PoLLMgraph, outperforming state-of-the-art methods by a considerable margin, evidenced by over 20% improvement in AUCROC on common benchmarking datasets like TruthfulQA. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors. © 2024 Association for Computational Linguistics. |
yuexi voted
Srividya voted
#321
-
Zhu 2024
Graph Structure Enhanced Pre-Training Language Model for Knowledge Graph Completion
IEEE Trans. Emerg. Top. Comput. Intell. 2024;8(4):2697-2708 2024 DOI: 10.1109/tetci.2024.3372442 · Ref ID: 2918 A vast amount of textual and structural information is required for knowledge graph construction and its downstream tasks. However, most current knowledge graphs are incomplete due to the difficulty of knowledge acquisition and integration. Knowledge Graph Completion (KGC) is used to predict the missing connections. In previous studies, textual information and graph structural information have been utilized independently, without an effective method for fusing the two types of information. In this paper, we propose a graph structure enhanced pre-training language model for knowledge graph completion. Firstly, we design a graph sampling algorithm and a Graph2Seq module for constructing sub-graphs and their corresponding contexts to support large-scale knowledge graph learning and parallel training; this is also the basis for fusing textual data and graph structure. Next, two pre-training tasks based on masked modeling are designed to capture accurate entity-level and relation-level information. Furthermore, this paper proposes a novel asymmetric Encoder-Decoder architecture to restore the masked components, where the encoder is a Pre-trained Language Model (PLM) and the decoder is a multi-relational Graph Neural Network (GNN). The purpose of this architecture is to integrate textual information effectively with graph structural information. Finally, the model is fine-tuned for KGC tasks on two widely used public datasets. The experiments show that the model achieves excellent performance and outperforms baselines in most metrics, demonstrating the effectiveness of our approach of fusing structural and semantic information into the knowledge graph. |
Srividya voted
Ishan voted
#2188
-
Zhu 2023
Automating Method Naming with Context-Aware Prompt-Tuning
2023 IEEE/ACM 31st International Conference on Program Comprehension (ICPC) 2023;():203-214 2023 DOI: 10.1109/ICPC58990.2023.00035 · Ref ID: 6816 Method names are crucial to program comprehension and maintenance. Recently, many approaches have been proposed to automatically recommend method names and detect inconsistent names. Though promising, their results are still suboptimal owing to the three following drawbacks: 1) These models are mostly trained from scratch, learning two different objectives simultaneously; the misalignment between the two objectives negatively affects training efficiency and model performance. 2) The enclosing class context is not fully exploited, making it difficult to learn the abstract functionality of the method. 3) Current method name consistency checking methods follow a generate-then-compare process, which restricts their accuracy, as they rely heavily on the quality of the generated names and face difficulty measuring semantic consistency. In this paper, we propose an approach named AUMENA to AUtomate MEthod NAming tasks with context-aware prompt-tuning. Unlike existing deep-learning-based approaches, our model first learns the contextualized representation (i.e., class attributes) of programming language and natural language through the pre-training model, then fully exploits the capacity and knowledge of the large language model with prompt-tuning to precisely detect inconsistent method names and recommend more accurate names. To better identify semantically consistent names, we model the method name consistency checking task as a two-class classification problem, avoiding the limitation of previous generate-then-compare consistency checking approaches. Experimental results show that AUMENA scores 68.6%, 72.0%, 73.6%, and 84.7% on four datasets of method name recommendation, surpassing the state-of-the-art baseline by 8.5%, 18.4%, 11.0%, and 12.0%, respectively.
Our approach also scores 80.8% accuracy on method name consistency checking, a 5.5% improvement over the baseline. All data and trained models are publicly available. |
Kwesi voted
Xinchen voted
#46
-
Zhu 2023
Automated extraction of domain knowledge in the dairy industry
From three weeks prior to calving to three weeks after calving, the transition period poses challenges for dairy cattle and farmers. Vast changes in housing, feeding, and reproduction might result in milk drop and in metabolic and reproductive diseases. Moreover, most of the metabolic processes are intricately linked, as many conditions can coexist. This means that dairy producers and their advisors have difficulty drawing concise conclusions across all aspects and relationships of transition cow management. Herein, machine-learning techniques and knowledge-graph theory were explored with a view to creating a decision-support system that could provide producers and their advisors with knowledge from the domain literature. Specifically, knowledge is modelled as entities and relationships following knowledge graph theory, and natural language models were developed to extract information as knowledge graphs. A dataset comprising 1152 sentences from 20 papers was created and split into 922 sentences for training and 230 sentences for testing. Two deep learning models were then trained to extract entities and relationships, respectively. For entity extraction, a bi-directional long short-term memory model was applied and obtained an F1 score of 80%. For relationship extraction, a Transformer-based model was deployed but yielded a low F1 of 23%, so another pre-trained Transformer model with 89% accuracy was deployed into the system instead. After feeding the domain literature into the deep-learning models, a knowledge graph of 1,576 nodes and 3,456 edges was constructed and stored in the graph database Neo4j. Afterward, a semantic parsing method was used to allow users to conduct question answering over the knowledge graph in natural language. In addition, to determine the quality of the answers produced from the knowledge built from the papers, answers were sampled and evaluated based on human judgment.
On average, answers scored 7.5 out of 10 and proved informative with respect to the original literature. Although the final interactive results demonstrated a high degree of visualization and scalability, this study primarily sought to demonstrate its feasibility. For tailored commercial applications, further improvements could be implemented in knowledge graph expansion and reasoning. |
Davis voted
Srividya voted
#484
-
Zhu 2023
KPT: Keyword-Guided Pre-training for Grounded Dialog Generation
37th AAAI Conference on Artificial Intelligence (AAAI) / 35th Conference on Innovative Applications of Artificial Intelligence / 13th Symposium on Educational Advances in Artificial Intelligence 2023;():14065-14073 Washington, DC Assoc Advancement Artificial Intelligence 2023 Ref ID: 3553 Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well w.r.t. different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation without relying on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming at handling two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can be easily performed on a large volume and variety of dialogue data. We considered three data sources (open-domain, task-oriented, conversational QA) with a total of 2.5M dialogues. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge. |
brandon voted
Kwesi voted
#2613
-
Zhu 2011
Knowledge Management Method for Expert System Based on Cognitive Model
2011 International Conference of Information Technology, Computer Engineering and Management Sciences 2011;4():77-80 2011 DOI: 10.1109/ICM.2011.352 · Ref ID: 6079 A living expert system needs a mechanism to update and extend its knowledge to adapt to a changing world, and the knowledge acquired through different approaches needs to be stored so that it can be reasoned over and updated conveniently. The method presented in this paper addresses this mission. Its core idea is to simulate the learning procedure of human beings, from which the ways to add, delete, amend, and use the knowledge in an expert system can be derived. Based on an analysis of the common sequence of children's actions while recognizing the world, a cognitive model of concept learning is abstracted. A general concept-learning algorithm, a knowledge representation method based on general rules, a forest-shaped logical structure, and a uniform data structure for storage are accordingly presented. Thus, a complete and more scientific management scheme for the knowledge base of an expert system is provided. Finally, in comparison with ontology knowledge bases such as CYC, WordNet, and NKI, two distinguishing characteristics of this management method are discussed. |
Xinchen voted
mohammed afaan voted
#865
-
Zhu 2017
Using Knowledge Graph And Search Query Click Logs in Statistical Language Model For Speech Recognition
18th Annual Conference of the International-Speech-Communication-Association (INTERSPEECH 2017) 2017;():2735-2738 Stockholm, SWEDEN Isca-Int Speech Communication Assoc 2017 DOI: 10.21437/Interspeech.2017-1790 · Ref ID: 3027 This paper demonstrates how a Knowledge Graph (KG) and Search Query Click Logs (SQCL) can be leveraged in statistical language models to improve named entity recognition for online speech recognition systems. Because some named entities are missing from the training data, they may be recognized as other common words with similar pronunciations. KG and SQCL cover comprehensive and fresh named entities and queries that can be used to mitigate such misrecognition. First, all the entities located in the same area in the KG are clustered together, and the queries that contain the entity names are selected from SQCL as the training data of a geographical statistical language model for each entity cluster. These geographical language models reduce the number of named entities that go unseen during model training, and they can be dynamically switched according to the user's location in the recognition phase. Second, if any named entities are identified in the previous utterances within a conversational dialog, the probability of the n-best word sequence paths that contain their related entities is increased for the current utterance by utilizing the entity relationships from KG and SQCL. This leverages the long-term context within the dialog. Experiments with the proposed approach on voice queries from a spoken dialog system yielded a 12.5% relative perplexity reduction in the language model measurement and a 1.1% absolute word error rate reduction in the speech recognition measurement. |
Mike voted
Srividya voted
#971
-
Zhu 2021
An Advanced Smart Contract Conversion and Its Design and Implementation for Auction Contract
As second-generation blockchain technology, smart contracts have greatly enriched the functional expression of blockchain, making application development more convenient. Smart contracts are a set of digitally executable protocols spanning business, finance, contract law, and information technology. In recent years, advanced smart contract languages (ASCLs) have been proposed to ease the reading, comprehension, and collaborative writing of smart contracts among people in different fields. However, such languages are still hard to put into practice due to the lack of an effective conversion method from ASCLs to executable smart contract programs. Aiming at this problem, we propose a three-layer smart contract framework, comprising an advanced smart-contract layer, a basic smart-contract layer, and an executable machine-code layer. After comparing and analyzing the pros and cons of several ASCLs, we take SPESC as an example to explore how to design conversion rules from its contracts to target-language contracts in Solidity. We specify the conversion rules from two aspects. One is the program architecture of the target language, which consists of a main contract and party contracts: the corresponding rules provide an approach to convert the definition of SPESC-based contracting parties into party sub-contracts in the target language, and to produce the rest of the SPESC contract as the main sub-contract. The other is the approach to specify not only the program architecture and storage structure of the basic smart-contract layer, but also important mechanisms including personnel management, timing control, and anomaly detection. These mechanisms can assist programmers in semi-automatically writing smart contract programs. Moreover, by introducing the notion of a group, the SPESC-based smart contract can support dynamically adding participants to the contract.
We also verify the legibility of SPESC and the correctness of the conversion processes through two case studies. First, we invited students from the departments of computer science and law. Divided into four groups, they were asked to read voting and auction contracts in SPESC and Solidity and to answer questions designed for the contracts. The results show that SPESC is read about twice as fast as Solidity, and with higher accuracy. Then, taking the auction contract as an instance, we analyze the bidding process and compile it into a SPESC contract, provide the whole process of converting the SPESC-based contract into an executable contract program in Solidity according to the above conversion rules, and verify the correctness of the conversion process, including coding, deploying, running, and testing, through an Ethereum private chain. The results show that the conversion rules and the three-layer framework can simplify the writing of smart contracts, standardize the program structure, and help programmers verify the correctness of the contract program. In future work, a formal representation shall be established for the existing SPESC language model; through formal methods, we can further provide formal analysis tools to verify pre- and post-conditions of contract terms, as well as the time sequence between terms. Secondly, regarding the correctness of the generated Solidity target code, we can continue to improve the generated code based on existing research on vulnerability analysis and detection, optimize the program structure and specifications, and enhance the security of the contract. © 2021, Science Press. All right reserved. |
mohammed afaan voted
yuexi voted
#3366
-
Zhu 2024
EMERGE: Integrating RAG for Improved Multimodal EHR Predictive Modeling
arXiv 2024;(): 2024 Ref ID: 8339 The integration of multimodal Electronic Health Records (EHR) data has notably advanced clinical predictive capabilities. However, current models that utilize clinical notes and multivariate time-series EHR data often lack the necessary medical context for precise clinical tasks. Previous methods using knowledge graphs (KGs) primarily focus on structured knowledge extraction. To address this, we propose EMERGE, a Retrieval-Augmented Generation (RAG) driven framework aimed at enhancing multimodal EHR predictive modeling. Our approach extracts entities from both time-series data and clinical notes by prompting Large Language Models (LLMs) and aligns them with professional PrimeKG to ensure consistency. Beyond triplet relationships, we include entities' definitions and descriptions to provide richer semantics. The extracted knowledge is then used to generate task-relevant summaries of patients' health statuses. These summaries are fused with other modalities utilizing an adaptive multimodal fusion network with cross-attention. Extensive experiments on the MIMIC-III and MIMIC-IV datasets for in-hospital mortality and 30-day readmission tasks demonstrate the superior performance of the EMERGE framework compared to baseline models. Comprehensive ablation studies and analyses underscore the efficacy of each designed module and the framework's robustness to data sparsity. EMERGE significantly enhances the use of multimodal EHR data in healthcare, bridging the gap with nuanced medical contexts crucial for informed clinical predictions. |
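The Retrieval-Augmented Generation step this abstract builds on can be reduced to "retrieve the most relevant knowledge snippets, then prepend them to the prompt". A generic sketch of that step in plain Python, using a bag-of-words cosine retriever; this is an illustrative assumption, not the EMERGE pipeline, which uses LLM entity extraction, PrimeKG alignment, and cross-attention fusion, and the example snippets are made up:

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two token-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Rank knowledge snippets by bag-of-words similarity to the query."""
    q = Counter(query.lower().split())
    return sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, docs):
    """Assemble a grounded prompt from the top retrieved snippets."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

For example, with hypothetical snippets `["aspirin relieves headache pain", "insulin regulates blood glucose", "the eiffel tower is in paris"]`, the query "what regulates blood glucose" retrieves the insulin snippet first.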
mohammed afaan voted
yuexi voted
#3126
-
Zhu 2024
EMERGE: Enhancing Multimodal Electronic Health Records Predictive Modeling with Retrieval-Augmented Generation
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management 2024;():3549–3559 Boise, ID, USA Association for Computing Machinery 2024 DOI: 10.1145/3627673.3679582 · Ref ID: 7301 |
mohammed afaan voted
yuexi voted
#3306
-
Zhu 2024
Croppable Knowledge Graph Embedding
arXiv 2024;(): 2024 Ref ID: 8443 Knowledge Graph Embedding (KGE) is a common method for Knowledge Graphs (KGs) to serve various artificial intelligence tasks. The suitable dimensions of the embeddings depend on the storage and computing conditions of the specific application scenarios. Once a new dimension is required, a new KGE model needs to be trained from scratch, which greatly increases the training cost and limits the efficiency and flexibility of KGE in serving various scenarios. In this work, we propose a novel KGE training framework, MED, through which we train once to obtain a croppable KGE model applicable to multiple scenarios with different dimensional requirements; sub-models of the required dimensions can be cropped out of it and used directly without any additional training. In MED, we propose a mutual learning mechanism to improve the low-dimensional sub-models' performance and let the high-dimensional sub-models retain the capacity that the low-dimensional sub-models have, an evolutionary improvement mechanism to promote the high-dimensional sub-models to master the knowledge that the low-dimensional sub-models cannot learn, and a dynamic loss weight to balance the multiple losses adaptively. Experiments on 3 KGE models over 4 standard KG completion datasets, 3 real application scenarios over a real-world large-scale KG, and experiments extending MED to the language model BERT show the effectiveness, high efficiency, and flexible extensibility of MED. |
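The "cropping" idea above means a sub-model is just a dimension prefix of the full embedding. A toy sketch with a TransE-style score (||h + r - t||, lower is more plausible) computed on either all dimensions or only a cropped prefix; the 4-dimensional vectors are made-up assumptions, and MED's mutual-learning training itself is not shown:

```python
import math

def transe_score(head, rel, tail, dims=None):
    """TransE-style plausibility score ||h + r - t|| (lower is better).

    If `dims` is given, score only the first `dims` dimensions, mimicking
    a low-dimensional sub-model cropped out of a larger embedding.
    """
    k = dims if dims is not None else len(head)
    return math.sqrt(sum((head[i] + rel[i] - tail[i]) ** 2 for i in range(k)))

# Hypothetical 4-d embeddings for a triple (head, relation, tail).
h = [1.0, 0.0, 2.0, 1.0]
r = [0.5, 0.5, -1.0, 0.0]
t = [1.5, 0.5, 1.0, 2.0]
```

Here the full 4-d score and the cropped 2-d score can differ, which is why a framework like MED must train the prefix dimensions to remain useful on their own.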
mohammed afaan voted
yuexi voted
#1593
-
Zia 2024
Leveraging large language models for automated knowledge graphs generation in non-destructive testing
CEUR Workshop Proceedings 2024;3760():101-110 CEUR-WS 2024 Ref ID: 4218 This paper presents an innovative approach for the automatic generation of Knowledge Graphs (KGs) from heterogeneous scientific articles in the domain of Non-Destructive Testing (NDT) applied to building materials. Our methodology leverages large language models (LLMs) to extract and semantically relate concepts from diverse sources. We developed material-specific agents for concrete, wood, steel, and bricks, each equipped with a curated glossary of terms to ensure domain accuracy. These agents process PDF documents, extracting relevant information on deterioration mechanisms, physical changes, and applicable NDT methods. The extracted data is then normalized, validated, and structured into a Neo4j graph database, forming a comprehensive KG. Our results demonstrate the system's ability to automatically discover and represent intricate relationships between materials, deterioration mechanisms, physical changes, and NDT techniques. The generated KG successfully captures complex interactions, such as the applicability of specific NDT methods to various materials under different deterioration conditions. This work not only highlights the potential of KGs in enhancing knowledge discovery and representation in NDT research but also provides a scalable framework for extending this approach to other scientific domains. © 2024 CEUR-WS. All rights reserved. |
Srividya voted
Ishan voted
#2713
-
Zimina 2018
MuG-QA: Multilingual Grammatical Question Answering for RDF Data
2018 IEEE International Conference on Progress in Informatics and Computing (PIC) 2018;():57-61 2018 DOI: 10.1109/PIC.2018.8706310 · Ref ID: 6362 We introduce Multilingual Grammatical Question Answering (MuG-QA), a system for answering questions in the English, German, Italian and French languages over DBpedia. The natural language modelling and parsing is implemented using Grammatical Framework (GF), a grammar formalism having natural support for multilinguality. The question analysis is based on forming an abstract conceptual grammar from the questions, and then using linearisation of the abstract grammar into different languages to parse the questions. Once a natural language question is parsed, the resulting abstract grammar tree is matched with the knowledge base schema and contents to formulate a SPARQL query. A particular strength of our approach is that once the abstract grammar has been designed, implementation for a new concrete language is relatively quick, supposing that the language has basic support in the GF Resource Grammar Library. MuG-QA has been tested with data from the QALD-7 benchmark and showed competitive results. |
mohammed afaan voted
yuexi voted
#2779
-
Zong 2002
Partitioning the UMLS semantic network
IEEE Transactions on Information Technology in Biomedicine 2002;6(2):102-108 2002 DOI: 10.1109/TITB.2002.1006296 · Ref ID: 6479 The unified medical language system (UMLS) integrates many well-established biomedical terminologies. The UMLS semantic network (SN) can help orient users to the vast knowledge content of the UMLS metathesaurus (META) via its abstract conceptual view. However, the SN itself is large and complex and may still be difficult to comprehend. Our technique partitions the SN into smaller meaningful units amenable to display on limited-sized computer screens. The basis for the partitioning is the distribution of the relationships within the SN. Three rules are applied to transform the original partition into a second more cohesive partition. |
mohammed afaan voted
yuexi voted
#402
-
Zou 2023
K-DLM: A Domain-Adaptive Language Model Pre-Training Framework with Knowledge Graph
32nd International Conference on Artificial Neural Networks (ICANN) 2023;14257():447-459 Heraklion, GREECE Springer International Publishing Ag 2023 DOI: 10.1007/978-3-031-44216-2_37 · Ref ID: 2923 Despite the excellent performance of pre-trained language models, such as BERT, on various natural language processing tasks, they struggle with tasks that require domain-specific knowledge. Integrating information from knowledge graphs through pre-training tasks is a common approach. However, existing models tend to focus on entity information at the word level and fail to capture the rich information in knowledge graphs. To address this issue, we propose a domain-adaptive language model pre-training framework with a knowledge graph (K-DLM). K-DLM can learn both word- and lexical-semantic-level entity information and relationships from the knowledge graph. It predicts entity categories and sememes for masked phrases, replaces entities in sentences according to the knowledge graph, and learns relationship information via contrastive learning. The evaluation on open-domain and domain-specific tasks demonstrates that K-DLM outperforms previous models, particularly in domain-specific contexts. Our findings highlight K-DLM as an excellent pre-training framework for knowledge-driven problems that leverage domain knowledge graphs. |
Xinchen voted
Kwesi voted
#3768
-
Zou 2024
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
arXiv 2024;(): 2024 Ref ID: 8095 Large language models (LLMs) have achieved remarkable success due to their exceptional generative capabilities. Despite their success, they also have inherent limitations such as a lack of up-to-date knowledge and hallucination. Retrieval-Augmented Generation (RAG) is a state-of-the-art technique to mitigate these limitations. The key idea of RAG is to ground the answer generation of an LLM on external knowledge retrieved from a knowledge database. Existing studies mainly focus on improving the accuracy or efficiency of RAG, leaving its security largely unexplored. We aim to bridge the gap in this work. We find that the knowledge database in a RAG system introduces a new and practical attack surface. Based on this attack surface, we propose PoisonedRAG, the first knowledge corruption attack to RAG, where an attacker could inject a few malicious texts into the knowledge database of a RAG system to induce an LLM to generate an attacker-chosen target answer for an attacker-chosen target question. We formulate knowledge corruption attacks as an optimization problem, whose solution is a set of malicious texts. Depending on the background knowledge (e.g., black-box and white-box settings) of an attacker on a RAG system, we propose two solutions to solve the optimization problem, respectively. Our results show PoisonedRAG could achieve a 90% attack success rate when injecting five malicious texts for each target question into a knowledge database with millions of texts. We also evaluate several defenses and our results show they are insufficient to defend against PoisonedRAG, highlighting the need for new defenses. |
mohammed afaan voted
yuexi voted
#1719
-
Zuo 2022
Patent-KG: Patent Knowledge Graph Extraction for Engineering Design
Proceedings of the Design Society 2022;2():821-830 Cambridge University Press 2022 DOI: 10.1017/pds.2022.84 · Ref ID: 5380 This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The resulting patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts from a patent by searching the attention graph in language models. The extracted entities are compared with other benchmarks using recall rate as the criterion. The result reaches the highest recall rate, 0.8, on the standard list of mechanical-engineering-related technical terms, meaning the highest coverage of engineering words. © The Author(s), 2022. |
Mike voted
Xinchen voted
#3754
-
Zuo 2021
Patent-KG: Patent Knowledge Graph Use for Engineering Design
arXiv 2021;(): 2021 Ref ID: 7476 To facilitate knowledge reuse in engineering design, several dataset approaches have been proposed and applied by designers. This paper builds a patent-based knowledge graph, patent-KG, to represent the knowledge facts in patents for engineering design. The resulting patent-KG approach proposes a new unsupervised mechanism to extract knowledge facts from a patent by searching the attention graph in language models. This method avoids using expensive labelled data in supervised learning or listing complex syntactic rules in rule-based extraction. The extracted entities are compared with other benchmarks using recall rate as the criterion. The result reaches the highest recall rate, 0.9, on the standard list of mechanical-engineering-related technical terms, meaning the highest coverage of engineering words. The extracted relationships are also compared with other benchmarks. The result shows that our method provides more contextual information in relationships and extracts more relationship types, including positional and negation relationships. |
Mike voted
Xinchen voted
#1946
Text2Story 2024 - Proceedings of Text2Story: 7th Workshop on Narrative Extraction From Texts, held in conjunction with the 46th European Conference on Information Retrieval, ECIR 2024
CEUR Workshop Proceedings 2024;3671(): CEUR-WS 2024 Ref ID: 4656 The proceedings contain 13 papers. The topics discussed include: dataset annotation and model building for identifying biases in news narratives; evaluating the ability of computationally extracted narrative maps to encode media framing; from nodes to narratives: a knowledge graph-based storytelling approach; estimating narrative durations: proof of concept; ROGER: extracting narratives using large language models from Robert Gerstmann's historical photo archive of the Sacambaya Expedition in 1928; representing complex relative chronology across narrative levels in movie plots; untangling a web of temporal relations in news articles; the geography of ‘fear’, ‘sadness’, ‘anger’ and ‘joy’: exploring the emotional landscapes in the holocaust survivors’ testimonies; and unexpected gender stereotypes in AI-generated stories: hairdressers are female, but so are doctors. |
Reviewers: Mike and Davis voted.
#1933
TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop
TEICAI 2024 - 1st Workshop Towards Ethical and Inclusive Conversational AI: Language Attitudes, Linguistic Diversity, and Language Rights, Proceedings of the Workshop 2024;(): Association for Computational Linguistics (ACL) 2024 Ref ID: 4750 The proceedings contain 7 papers. The topics discussed include: how do conversational agents in healthcare impact on patient agency?; why academia should cut back general enthusiasm about CAs; bridging the language gap: integrating language variations into conversational ai agents for enhanced user engagement; socio-cultural adapted chatbots: harnessing knowledge graphs and large language models for enhanced context awareness; how should conversational agent systems respond to sexual harassment?; non-referential functions of language in social agents: the case of social proximity; and making a long story short in conversation modeling. |
Reviewers: mohammed afaan and Ishan voted.
#1883
SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
SIGIR 2024 - Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval 2024;(): Association for Computing Machinery, Inc 2024 Ref ID: 3951 The proceedings contain 380 papers. The topics discussed include: TRAD: enhancing LLM agents with step-wise thought retrieval and aligned decision; CorpusLM: towards a unified language model on corpus for knowledge-intensive tasks; a setwise approach for effective and highly efficient zero-shot ranking with large language models; unsupervised large language model alignment for information retrieval via contrastive feedback; METAHKG: meta hyperbolic learning for few-shot temporal reasoning; transformer-based reasoning for learning evolutionary chain of events on temporal knowledge graph; contrast then memorize: semantic neighbor retrieval-enhanced inductive multimodal knowledge graph completion; and Amazon-KG: a knowledge graph enhanced cross-domain recommendation dataset. |
Reviewers: Mike and Srividya voted.
#1877
SemTech4STLD 2024 - 2nd International Workshop on Semantic Technologies and Deep Learning Models for Scientific, Technical and Legal Data, co-located with the Extended Semantic Web Conference 2024, ESWC 2024
CEUR Workshop Proceedings 2024;3697(): CEUR-WS 2024 Ref ID: 4563 The proceedings contain 8 papers. The topics discussed include: GerPS-NER: a dataset for named entity recognition to support public service process creation in Germany; ChatGPT vs. Google Gemini: assessing AI frontiers for patent prior art search using European search reports; PRICER: leveraging few-shot learning with fine-tuned large language models for unstructured economic data; extracting license information from web resources with a large language model; investigating environmental, social, and governance (ESG) discussions in news: a knowledge graph analysis empowered by AI; bridging the innovation gap: leveraging patent information for scientists by constructing a patent-centric knowledge graph; automating citation placement with natural language processing and transformers; and combining knowledge graphs and large language models to ease knowledge access in software architecture research. |
Reviewers: Mike and mohammed afaan voted.
#1875
SemTab 2022 - Proceedings of the Semantic Web Challenge on Tabular Data to Knowledge Graph Matching, co-located with the 21st International Semantic Web Conference, ISWC 2022
CEUR Workshop Proceedings 2022;3320(): CEUR-WS 2022 Ref ID: 5487 The proceedings contain 13 papers. The topics discussed include: results of SemTab 2022; SOTAB: the WDC Schema.org table annotation benchmark; Wikary: a dataset of N-ary Wikipedia tables matched to qualified Wikidata statements; MammoTab: a giant and comprehensive dataset for semantic table interpretation; a large scale corpus of food composition tables; KGCODE-Tab results for SemTab 2022; from heuristics to language models: a journey through the universe of semantic table interpretation with DAGOBAH; s-elBat: a semantic interpretation approach for Messy taBle-s; JenTab: do CTA solutions affect the entire scores?; yet another milestone for Kepler-aSI at SemTab 2022; a low-resource approach to SemTab 2022; and towards an approach based on knowledge graph refinement for tabular data to knowledge graph matching. |
Reviewers: Mike and mohammed afaan voted.
#1874
SEMPDW 2022 - Proceedings of Poster and Demo Track and Workshop Track of the 18th International Conference on Semantic Systems, co-located with 18th International Conference on Semantic Systems, SEMANTiCS 2022
CEUR Workshop Proceedings 2022;3235(): CEUR-WS 2022 Ref ID: 5429 The proceedings contain 28 papers. The topics discussed include: attribute-based access control on solid pods using privacy-friendly credentials; language-agnostic knowledge graphs for smarter multilingual chatbots; solid proof of concept in an enterprise loan request use case; applying a mapping quality framework in cloud native monitoring; misinformation detection: using linguistic cues; a semantic policy language for usage control; proposal for PORQUE, a polylingual hybrid question answering system; Wikibase as an infrastructure for community documents: the example of the disability wiki platform; combining knowledge graphs and language models to answer questions over tables; semantifying the governance of data in Europe; towards a knowledge access & representation layer; and towards knowledge graph based services in accounting use cases. |
Reviewers: mohammed afaan and yuexi voted.
#1873
SEMPDS 2023 - Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems, co-located with 19th International Conference on Semantic Systems, SEMANTiCS 2023
CEUR Workshop Proceedings 2023;3526(): CEUR-WS 2023 Ref ID: 5194 The proceedings contain 9 papers. The topics discussed include: a framework to generate, store, and publish FAIR data in experimental sciences; a mapping lifecycle for public procurement data; a toolset for normative interpretations in FLINT; developing a scalable benchmark for assessing large language models in knowledge graph engineering; enhancing interpretability of machine learning models over knowledge graphs; OntoAnon: an anonymizer for sharing ontology structure without data; SPARQLGEN: one-shot prompt-based approach for SPARQL query generation; and towards assessing FAIR research software best practices in an organization using RDF-star. |
Reviewers: mohammed afaan and Ishan voted.
#1852
Scholarly QALD 2023 and SemREC 2023 - Joint Proceedings of 1st Scholarly QALD Challenge 2023 and 4th SeMantic Answer Type, Relation and Entity Prediction Tasks Challenge 2023, co-located with 22nd International Semantic Web Conference, ISWC 2023
CEUR Workshop Proceedings 2023;3592(): CEUR-WS 2023 Ref ID: 5070 The proceedings contain 9 papers. The topics discussed include: when context matters: entity linking in the scholarly domain; NLQxform: a language model-based question to SPARQL transformer; a structure and content prompt-based method for knowledge graph question answering over scholarly data; leveraging LLMs in scholarly knowledge graph question answering; improving subgraph extraction algorithms for one-shot SPARQL query generation with large language models; PSYCHIC: a neuro-symbolic framework for knowledge graph question-answering grounding; BERTologyNavigator: advanced question answering with BERT-based semantics; enhanced GAT: expanding receptive field with meta path-guided RDF rules for two-hop connectivity; and evaluating different methods for semantic reasoning over ontologies. |
Reviewers: Mike and Davis voted.
#1757
Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
ACM International Conference Proceeding Series 2023;(): Association for Computing Machinery 2023 Ref ID: 4760 The proceedings contain 255 papers. The topics discussed include: a comparison of mesh-free differentiable programming and data-driven strategies for optimal control under PDE constraints; autotuning Apache TVM-based scientific applications using Bayesian optimization; enhancing heterogeneous federated learning with knowledge extraction and multi-model fusion; elastic deep learning through resilient collective operations; accelerating particle and fluid simulations with differentiable and interpretable graph networks for solving forward and inverse problems; machine learning applied to single-molecule activity prediction; entropy-driven optimal sub-sampling of fluid dynamics for developing machine-learned surrogates; towards rapid autonomous electron microscopy with active meta-learning; protein generation via genome-scale language models with bio-physical scoring; Tencoder: tensor-product encoder-decoder architecture for predicting solutions of PDEs with variable boundary data; and AI/ML-derived whole-genome predictor prospectively and clinically predicts survival and response to treatment in brain cancer. |
Reviewers: mohammed afaan and yuexi voted.
#1749
Proceedings - 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024
Proceedings - 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 Ref ID: 4492 The proceedings contain 85 papers. The topics discussed include: differentially private multi-label learning is harder than you’d think; attacking operational technology without specialized knowledge: the unspecialized OT threat actor profile; towards an integrated provenance framework - a scenario for marine data; better left shift security! framework for secure software development; are you sure you want to do coordinated vulnerability disclosure?; actionable cyber threat intelligence using knowledge graphs and large language models; optimal flow collector placement in experimental networks; and a methodology to measure the ‘cost’ of CPS attacks: not all CPS networks are created equal. |
Reviewers: mohammed afaan and Ishan voted.
#1690
NLP4ConvAI 2023 - 5th Workshop on NLP for Conversational AI, Proceedings of the Workshop
Proceedings of the Annual Meeting of the Association for Computational Linguistics 2023;(): Association for Computational Linguistics (ACL) 2023 Ref ID: 5138 The proceedings contain 13 papers. The topics discussed include: response generation in longitudinal dialogues: which knowledge representation helps?; on the underspecification of situations in open-domain conversational datasets; correcting semantic parses with natural language through dynamic schema encoding; dialogue state tracking with sparse local slot attention; LLM-Eval: unified multi-dimensional automatic evaluation for open-domain conversations with large language models; cTBLS: augmenting large language models with conversational tables; user simulator assisted open-ended conversational recommendation system; evaluating inter-bilingual semantic parsing for Indian languages; zero-shot dialogue relation extraction by relating explainable triggers and relation names; generating video game scripts with style; and a survey of challenges and methods in the computational modeling of multi-party dialog. |
Reviewers: mohammed afaan and yuexi voted.
#1680
NeSy 2023 - Proceedings of the 17th International Workshop on Neural-Symbolic Learning and Reasoning
CEUR Workshop Proceedings 2023;3432(): CEUR-WS 2023 Ref ID: 5254 The proceedings contain 33 papers. The topics discussed include: a roadmap for neuro-argumentative learning; what's wrong with gradient-based complex query answering?; closing the neural-symbolic cycle: knowledge extraction, user intervention and distillation from convolutional neural networks; the challenge of learning symbolic representations; exploring mathematical conjecturing with large language models; learning logic constraints from demonstration; from axioms over graphs to vectors, and back again: evaluating the properties of graph-based ontology embeddings; neural-symbolic predicate invention: learning relational concepts from visual scenes; semantic interpretability of convolutional neural networks by taxonomy extraction; preliminary results on a state-driven method for rule construction in neural-symbolic reinforcement learning; and is the proof length a good indicator of hardness for reason-able embeddings?. |
Reviewers: mohammed afaan and Ishan voted.
#1668
NAIS 2023 - Proceedings of the 5th Symposium of the Norwegian AI Society
CEUR Workshop Proceedings 2023;3431(): CEUR-WS 2023 Ref ID: 5301 The proceedings contain 10 papers. The topics discussed include: crowd simulation with deliberative-reactive agents; generating natural language dialogues using large language models with adapters; the AI Act and the risks posed by generative AI models; Bayesian exploration in deep reinforcement learning; analyzing literary texts in Lithuanian sign language with computer vision: a proof of concept; automatic detection of manipulative consent management platforms and the journey into the patterns of darkness; EvoLP.jl: a playground for evolutionary computation in Julia; making sense of nonsense: integrated gradient-based input reduction to improve recall for check-worthy claim detection; construction of a relevance knowledge graph with application to the LOCAL news angle; and container-based IoT architectures: use case for visual person counting. |
Reviewers: mohammed afaan and yuexi voted.
#1649
MLSMKG 2021 - Machine Learning with Symbolic Methods and Knowledge Graphs, co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2021
CEUR Workshop Proceedings 2021;2997(): CEUR-WS 2021 Ref ID: 5670 The proceedings contain 4 papers. The topics discussed include: ontology-based N-ball concept embeddings informing few-shot image classification; contextual graph representation learning in text disambiguation; contextual language models for knowledge graph completion; and on refining BERT contextualized embeddings using semantic lexicons. |
Reviewers: mohammed afaan and Ishan voted.
#1551
Lang + Mol 2024 - 1st Workshop on Language + Molecules, Proceedings of the Workshop
Lang + Mol 2024 - 1st Workshop on Language + Molecules, Proceedings of the Workshop 2024;(): Association for Computational Linguistics (ACL) 2024 Ref ID: 4212 The proceedings contain 15 papers. The topics discussed include: could chemical language models benefit from message passing; ALMol: aligned language-molecule translation LLMs through offline preference contrastive optimization; evaluating extrapolation ability of large language model in chemical domain; design proteins using large language models: enhancements and comparative analyses; enhanced biot5+ for molecule-text translation: a three-stage approach with data distillation, diverse training, and voting ensemble; SciMind: a multimodal mixture-of-experts model for advancing pharmaceutical sciences; knowledge graph extraction from total synthesis documents; and Knowlab’s submission to L+M shared task: all you need is continued pretraining of chemistry texts even for molecule captioning. |
Reviewers: mohammed afaan and Ishan voted.
#1388
ICISE 2021 - 2021 6th International Conference on Information Systems Engineering
ACM International Conference Proceeding Series 2021;(): Association for Computing Machinery 2021 Ref ID: 5539 The proceedings contain 18 papers. The topics discussed include: big data: finding frequencies of faulty multimedia data; financial big data security and privacy in x-accounting: a step further to implement the triple-entry accounting; research on real-time data warehouse technology for sea battlefield; business planning and big data, budget modelling upgrade through data science; predicting the total population development of China based on logistic blocking growth model and improved grey GM (1,1) prediction model; learning knowledge uncertainty from the pretrained language model; dual-channel BERT-DBLCA based on attention mechanism for news category label classification model; research on the development of key technologies of tactical edge cloud; and research on method of undesirable text recognition based on deep learning and knowledge graph. |
Reviewers: Xinchen and mohammed afaan voted.
#1233
EKG-LLM 2023 - Proceedings of the Workshop on Enterprise Knowledge Graphs using Large Language Models, co-located with 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023
CEUR Workshop Proceedings 2023;3532(): CEUR-WS 2023 Ref ID: 5156 The proceedings contain 6 papers. The topics discussed include: EduEmbedd - a knowledge graph embedding for education; related table search for numeric data using large language models and enterprise knowledge graphs; cognitive retrieve: empowering document retrieval with semantics and domain specific knowledge graph; CRUSH: cybersecurity research using universal LLMs and semantic hypernetworks; and StATIK+: structure and text for inductive knowledge graph modeling and paths towards enterprise implementations. |
Reviewers: mohammed afaan and yuexi voted.
#1232
EKAW-C 2022 - Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge Management
CEUR Workshop Proceedings 2022;3256(): CEUR-WS 2022 Ref ID: 5500 The proceedings contain 11 papers. The topics discussed include: experiment maker: a tool to create experiments with GPT-3 easily; CrowdIQ: an ontology for crowdsourced information quality assessments; automated identification of flaky builds using knowledge graphs; ATONTE: towards a new methodology for seed ontology development from texts and experts; a step toward semantic content negotiation; FAIR ontologies, FAIR ontology alignments; extracting structured knowledge from Dutch legal texts: a rule-based approach; knowledge-based legal document retrieval: a case study on Italian civil court decisions; ITALIAN-LEGAL-BERT: a pre-trained transformer language model for Italian law; and public procurement fraud detection and artificial intelligence techniques: a literature review. |
Reviewers: Xinchen and mohammed afaan voted.
#1202
DL4KG 2023 - Proceedings of the Workshop on Deep Learning for Knowledge Graphs, co-located with the 21st International Semantic Web Conference, ISWC 2023
CEUR Workshop Proceedings 2023;3559(): CEUR-WS 2023 Ref ID: 5052 The proceedings contain 7 papers. The topics discussed include: location query answering using box embeddings; knowledge graph injection for reinforcement learning; benchmarking the abilities of large language models for RDF knowledge graph creation and comprehension: how well do LLMs speak turtle?; enhancing large language models with knowledge graphs for classification tasks in the tourism domain; universal preprocessing operators for embedding knowledge graphs with literals; NNKGC: improving knowledge graph completion with node neighborhoods; and enhancing scholarly understanding: a comparison of knowledge injection strategies in large language models. |
Reviewers: Davis and mohammed afaan voted.
#1168
Deep Learning Inside Out: 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO 2021 - Proceedings, co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics
Deep Learning Inside Out: 2nd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO 2021 - Proceedings, co-located with the Annual Conference of the North American Chapter of the Association for Computational Linguistics 2021;(): Association for Computational Linguistics (ACL) 2021 Ref ID: 5666 The proceedings contain 14 papers. The topics discussed include: transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors; reconstructing implicit knowledge with language models; investigating the effect of background knowledge on natural questions; augmenting topic aware knowledge-grounded conversations with dynamic built knowledge graphs; what makes my model perplexed? a linguistic investigation on neural language models perplexity; how do BERT embeddings organize linguistic knowledge?; enhancing multiple-choice question answering with causal knowledge; and low anisotropy sense retrofitting (LASeR): towards isotropic and sense enriched representations. |
Reviewers: Ishan and Srividya voted.
#1157
D2R2 2024 - Proceedings of the 3rd International Workshop on Linked Data-Driven Resilience Research, co-located with European Semantic Web Conference 2024, ESWC 2024
CEUR Workshop Proceedings 2024;3707(): CEUR-WS 2024 Ref ID: 4538 The proceedings contain 8 papers. The topics discussed include: empowering supply chains resilience: LLMs-powered BN for proactive supply chain risk identification; anticipate risk with the value and trade flows knowledge graph; entity alignment for knowledge graphs in the context of supply chain risk management; leveraging small language models for Text2SPARQL tasks to improve the resilience of AI assistance; towards a regional public dashboard for crisis and resilience management; an automated evaluation framework for graph database query generation leveraging large language models; and towards modeling the structure of product dependencies in supply networks to identify bottlenecks among suppliers. |
Reviewers: Srividya and Mike voted.
#1118
CONDA 2024 - 1st Data Contamination Workshop, Proceedings of the Workshop
CONDA 2024 - 1st Data Contamination Workshop, Proceedings of the Workshop 2024;(): Association for Computational Linguistics (ACL) 2024 Ref ID: 4383 The proceedings contain 4 papers. The topics discussed include: evaluating Chinese large language models on discipline knowledge acquisition via memorization and robustness assessment; confounders in instance variation for the analysis of data contamination; a taxonomy for data contamination in large language models; and data contamination report from the 2024 CONDA shared task. |
Reviewers: mohammed afaan and yuexi voted.
#1082
CMLDS 2024 - 2024 International Conference on Computing, Machine Learning and Data Science, Conference Proceedings
ACM International Conference Proceeding Series 2024;(): Association for Computing Machinery 2024 Ref ID: 4031 The proceedings contain 60 papers. The topics discussed include: privacy-preservation robust federated learning with blockchain-based hierarchical framework; spatio-temporal hypergraph convolutional network based network traffic prediction; hash function based on quantum walks with two-step memory; adversarial analysis and methods for math word problems; object tracking based on adaptive multi-template fusing; analysis of spatial-temporal variability and heterogeneity of soil moisture; can deep learning large language models be used to unravel knowledge graph creation?; binary and multi-label machine learning models for discrete-time survival analysis: a case study to predict complications and mortality in Thai diabetic patients; power factor anomaly detection using data stream summaries; and consensus filter for distributed sensor networks with unknown colored noise. |
Reviewers: mohammed afaan and yuexi voted.
#1057
Case-Based Reasoning Research and Development - 31st International Conference, ICCBR 2023, Proceedings
Lect. Notes Comput. Sci. 2023;14141 LNAI(): 2023 Ref ID: 5296 The proceedings contain 26 papers. The special focus in this conference is on Case-Based Reasoning Research and Development. The topics include: CBR Driven Interactive Explainable AI; Selecting Explanation Methods for Intelligent IoT Systems: A Case-Based Reasoning Approach; CBR-fox: A Case-Based Explanation Method for Time Series Forecasting Models; Group Fairness in Case-Based Reasoning; Addressing Underestimation Bias in CBR Through Case-Base Maintenance; Towards Addressing Problem-Distribution Drift with Case Discovery; Case-Based Adaptation of Argument Graphs with WordNet and Large Language Models; Failure-Driven Transformational Case Reuse of Explanation Strategies in CloodCBR; A Case-Based Approach for Workflow Flexibility by Deviation; Synergies Between Case-Based Reasoning and Deep Learning for Survival Analysis in Oncology; Lazy Adaptation Knowledge Learning Based on Frequent Closed Itemsets; An Overview and Comparison of Case-Based Reasoning Frameworks; Case-Based Cleaning of Text Images; A Multi-agent Case-Based Reasoning Intrusion Detection System Prototype; A Case-Based Reasoning Approach to Company Sector Classification Using a Novel Time-Series Case Representation; An Integrated Approach to Predicting the Influence of Reputation Mechanisms on Q&A Communities; Retrieval of Similar Cases to Improve the Diagnosis of Diabetic Retinopathy; CBR Assisted Context-Aware Surface Realisation for Data-to-Text Generation; Explanation of Similarities in Process-Oriented Case-Based Reasoning by Visualization; On-Demand and Model-Driven Case Building Based on Distributed Data Sources; The Case for Circularities in Case-Based Reasoning; A Contextual Information-Augmented Probabilistic Case-Based Reasoning Model for Knowledge Graph Reasoning; Case-Based Sample Generation Using Multi-Armed Bandits; Hybrid Event Memory as a Case Base for State Estimation in Cognitive Agents. |
Reviewers: Xinchen and mohammed afaan voted.
#1045
CAiSE-DC 2024 - Proceedings of the Doctoral Consortium Papers Presented at the 36th International Conference on Advanced Information Systems Engineering
CEUR Workshop Proceedings 2024;3767(): CEUR-WS 2024 Ref ID: 4135 The proceedings contain 8 papers. The topics discussed include: from adoption to endurance: exploring the dynamics of general-purpose AI adoption across time and contexts; intelligent perception systems for multi-modal data processing in industrial application contexts; a conceptual modeling-based journey into variant interpretation: from unpacking to operationalization; a methodological approach to model-driven software development for quality assurance in metaverse environments; integrating LLMs with knowledge graphs-enhanced task-oriented dialogue systems; translating polygenic risk score research to a clinical setting; comparable and repeatable information security level evaluation; and selecting adequate machine learning methods for human-computer interaction data sets: guidelines and a conceptual structure. |
Reviewers: mohammed afaan and yuexi voted.
#905
8th China Conference on Knowledge Graph and Semantic Computing, CCKS 2023
Commun. Comput. Info. Sci. 2023;1923 CCIS(): 2023 Ref ID: 5148 The proceedings contain 28 papers. The special focus in this conference is on Knowledge Graph and Semantic Computing. The topics include: A Generalized Strategy of Chinese Grammatical Error Diagnosis Based on Task Decomposition and Transformation; Conversational Search Based on Utterance-Mask-Passage Post-training; Financial Fraud Detection Based on Deep Learning: Towards Large-Scale Pre-training Transformer Models; GERNS: A Graph Embedding with Repeat-Free Neighborhood Structure for Subgraph Matching Optimization; Feature Enhanced Structured Reasoning for Question Answering; Conditional Knowledge Graph: Design, Dataset and a Preliminary Model; ODKG: An Official Document Knowledge Graph for the Effective Management; CCD-ASQP: A Chinese Cross-Domain Aspect Sentiment Quadruple Prediction Dataset; Move Structure Recognition in Scientific Papers with Saliency Attribution; CausE: Towards Causal Knowledge Graph Embedding; Moral Essential Elements: MEE-A Dataset for Moral Judgement; Improving Adaptive Knowledge Graph Construction via Large Language Models with Multiple Views; Single Source Path-Based Graph Neural Network for Inductive Knowledge Graph Reasoning; A Graph Learning Based Method for Inductive Knowledge Graph Relation Prediction; LLM-Based SPARQL Generation with Selected Schema from Large Scale Knowledge Base; Robust NL-to-Cypher Translation for KBQA: Harnessing Large Language Model with Chain of Prompts; In-Context Learning for Knowledge Base Question Answering for Unmanned Systems Based on Large Language Models; A Military Domain Knowledge-Based Question Answering Method Based on Large Language Model Enhancement; Advanced PromptCBLUE Performance: A Novel Approach Leveraging Large Language Models; Exploring the Logical Expressiveness of Graph Neural Networks by Establishing a Connection with C2; Research on Joint Representation Learning Methods for Entity Neighborhood Information and Description Information; Harvesting Event Schemas from Large Language Models; NTDA: Noise-Tolerant Data Augmentation for Document-Level Event Argument Extraction; Event-Centric Opinion Mining via In-Context Learning with ChatGPT; Relation Repository Based Adaptive Clustering for Open Relation Extraction. |
Reviewers: yuexi and mohammed afaan voted.
#902
6th China Conference on Knowledge Graph and Semantic Computing, CCKS 2021
Commun. Comput. Info. Sci. 2022;1553 CCIS(): 2022 Ref ID: 5514 The proceedings contain 17 papers. The special focus in this conference is on Knowledge Graph and Semantic Computing. The topics include: Enhance Both Text and Label: Combination Strategies for Improving the Generalization Ability of Medical Entity Extraction; Knowledge-Enhanced Retrieval: A Scheme for Question Answering; Multi-label Fine-Grained Entity Typing for Baidu Wikipedia Based on Pre-trained Model; Multi-strategies Integrated Information Extraction for Scholar Profiling Task; Named Entity Recognition and Event Extraction in Chinese Electronic Medical Records; Strategies for Enhancing Generalization Ability of Communication Event Co-reference Resolution; Unmanned Aerial Vehicle Knowledge Graph Construction with SpERT; Method Description for CCKS 2021 Task 3: A Classification Approach of Scholar Structured Information Extraction from HTML Web Pages; A Dual-Classifier Model for General Fine-Grained Event Detection Task; A Joint Training Framework Based on Adversarial Perturbation for Video Semantic Tags Classification; A Multi-modal System for Video Semantic Understanding; An Integrated Method of Semantic Parsing and Information Retrieval for Knowledge Base Question Answering; Basic Profiling Extraction Based on XGBoost; Data Augmentation Based on Pre-trained Language Model for Event Detection; Does BERT Know Which Answer Beyond the Question? |
Davis voted
mohammed afaan voted
|
#901
4th International Conference on Cognitive Computing, ICCC 2020, held as part of Services Conference Federation, SCF 2020
Lect. Notes Comput. Sci. 2020;12408 LNCS(): 2020 Ref ID: 5714 The proceedings contain 10 papers. The special focus in this conference is on Cognitive Computing. The topics include: PRTransE: Emphasize More Important Facts Based on Pagerank for Knowledge Graph Completion; context Based Quantum Language Model with Application to Question Answering; improving Fake Product Detection with Aspect-Based Sentiment Analysis; a Dual Layer Regression Model for Cross-border E-commerce Industry Sale and Hot Product Prediction; end-to-End Nested Multi-Attention Network for 3D Brain Tumor Segmentation; ALBERT-Based Chinese Named Entity Recognition; cognitive and Predictive Analytics on Big Open Data; semantic Enhancement Based Dynamic Construction of Domain Knowledge Graph. |
Davis voted
mohammed afaan voted
|
#959
42nd European Conference on IR Research, ECIR 2020
Lect. Notes Comput. Sci. 2020;12036 LNCS(): 2020 Ref ID: 5791 The proceedings contain 144 papers. The special focus in this conference is on IR Research. The topics include: Neural embedding-based metrics for pre-retrieval query performance prediction; a latent model for ad hoc table retrieval; hybrid semantic recommender system for chemical compounds; Assessing the impact of OCR errors in information retrieval; towards query logs for privacy studies: On deriving search queries from questions; machine-actionable data management plans: A knowledge retrieval approach to automate the assessment of funders’ requirements; session-based path prediction by combining local and global content preferences; unsupervised ensemble of ranking models for news comments using pseudo answers; irony detection in a multilingual context; document network projection in pretrained word embedding space; the effect of content-equivalent near-duplicates on the evaluation of search engines; supervised learning methods for diversification of image search results; ANTIQUE: A non-factoid question answering benchmark; neural query-biased abstractive summarization using copying mechanism; distant supervision for extractive question summarization; text-image-video summary generation using joint integer linear programming; domain adaptation via context prediction for engineering diagram search; crowdsourcing truthfulness: The impact of judgment scale and assessor bias; novel and diverse recommendations by leveraging linear models with user and item embeddings; a multi-task approach to open domain suggestion mining using language model for text over-sampling; medLinker: Medical entity linking with neural representations and dictionary matching; From MaxSCORE to block-max WAND: The story of how lucene significantly improved query evaluation performance; ranking significant discrepancies in clinical reports; teaching a new dog old tricks: Resurrecting multilingual retrieval using zero-shot learning. |
Davis voted
mohammed afaan voted
|
#942
21st International Semantic Web Conference, ISWC 2022
Lect. Notes Comput. Sci. 2022;13489 LNCS(): 2022 Ref ID: 5454 The proceedings contain 48 papers. The special focus in this conference is on International Semantic Web. The topics include: H2TNE : Temporal Heterogeneous Information Network Embedding in Hyperbolic Spaces; facing Changes: Continual Entity Alignment for Growing Knowledge Graphs; Mapping Relational Database Constraints to SHACL; POSO: A Generic Positioning System Ontology; each Snapshot to Each Space: Space Adaptation for Temporal Knowledge Graph Completion; efficient Dependency Analysis for Rule-Based Ontologies; heterogeneous Graph Neural Network with Hypernetworks for Knowledge Graph Embedding; MultPAX: Keyphrase Extraction Using Language Models and Knowledge Graphs; RT-KGD: Relation Transition Aware Knowledge-Grounded Dialogue Generation; Faithful Embeddings for EL+ + Knowledge Bases; LoGNet: Local and Global Triple Embedding Network; an Analysis of Content Gaps Versus User Needs in the Wikidata Knowledge Graph; Repairing SHACL Constraint Violations Using Answer Set Programming; entity Type Prediction Leveraging Graph Walks and Entity Descriptions; Strabo 2: Distributed Management of Massive Geospatial RDF Datasets; Controlled Query Evaluation in OWL 2 QL: A “Longest Honeymoon” Approach; a Survey of Syntactic Modelling Structures in Biomedical Ontologies; HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs; GNNQ: A Neuro-Symbolic Approach to Query Answering over Incomplete Knowledge Graphs; Radar Station: Using KG Embeddings for Semantic Table Interpretation and Entity Disambiguation; enhancing Document-Level Relation Extraction by Entity Knowledge Injection; CRNet: Modeling Concurrent Events over Temporal Knowledge Graph; LODChain: Strengthen the Connectivity of Your RDF Dataset to the Rest LOD Cloud; WDV: A Broad Data Verbalisation Dataset Built from Wikidata; machine Learning-Friendly Biomedical Datasets for Equivalence and Subsumption Ontology Matching; The DLCC Node Classification 
Benchmark for Analyzing Knowledge Graph Embeddings; μKG : A Library for Multi-source Knowledge Graph Embeddings and Applications; IMGT-KG: A Knowledge Graph for Immunogenetics; REBench: Microbenchmarking Framework for Relation Extraction Systems; WDBench: A Wikidata Graph Query Benchmark. |
Davis voted
mohammed afaan voted
|
#940
21st International Conference on Service-Oriented Computing, ICSOC 2023
Lect. Notes Comput. Sci. 2023;14420 LNCS(): 2023 Ref ID: 5037 The proceedings contain 48 papers. The special focus in this conference is on Service-Oriented Computing. The topics include: IDLGen: Automated Code Generation for Inter-parameter Dependencies in Web APIs; time-Aware Log Anomaly Detection Based on Growing Self-organizing Map; an Empirical Evaluation of the Energy and Performance Overhead of Monitoring Tools on Docker-Based Systems; chainsFormer: A Chain Latency-Aware Resource Provisioning Approach for Microservices Cluster; energy-Efficient and Communication-Aware Resource Allocation in Container-Based Cloud with Group Genetic Algorithm; engineering Self-adaptive Microservice Applications: An Experience Report; FUSE: Fault Diagnosis and Suppression with eBPF for Microservices; serviceSim: A Modelling and Simulation Toolkit of Microservice Systems in Cloud-Edge Environment; 2DPChain: Orchestrating Transactions in Order-Execute Blockchain to Exploit Intra-batch and Inter-batch Parallelism; deep Learning Model for Personalized Web Service Recommendations Using Attention Mechanism; a Dynamical Model for the Nonlinear Features of Value-Driven Service Ecosystem Evolution; a Middleware for Hybrid Blockchain Applications: Towards Fast, Affordable, and Accountable Integration; An AI Chatbot for Explaining Deep Reinforcement Learning Decisions of Service-Oriented Systems; BEAR: Revolutionizing Service Domain Knowledge Graph Construction with LLM; dependency-Aware Resource Allocation for Serverless Functions at the Edge; distributing Quantum Computations, by Shots; energy-Efficient Task Offloading with Statistic QoS Constraint Through Multi-level Sleep Mode in Ultra-Dense Network; enhancing Blockchain Performance via On-chain and Off-chain Collaboration; deep Reinforcement Learning-Based Scheduling for Same Day Delivery with a Dynamic Number of Drones; designing Reconfigurable Intelligent Systems with Markov Blankets; exploiting Category Information in Sequential 
Recommendation; Niagara: Scheduling DNN Inference Services on Heterogeneous Edge Processors; plan, Generate and Match: Scientific Workflow Recommendation with Large Language Models; predicting Effect and Cost of Microservice System Evolution Using Graph Neural Network; qoS Prediction via Multi-scale Feature Fusion Based on Convolutional Neural Network. |
Davis voted
mohammed afaan voted
|
#933
20th European, Mediterranean, and Middle Eastern Conference, EMCIS 2023
Lect. Notes Bus. Inf. Process. 2024;502 LNBIP(): 2024 Ref ID: 4652 The proceedings contain 43 papers. The special focus in this conference is on European, Mediterranean, and Middle Eastern. The topics include: Web Mining for Estimating Regulatory Blockchain Readiness; reviewing the Role of Secret Sharing Schemes in Electronic Payment Protocols; Decentralization of DAOs: A Fundamental Analysis; Blockchain-Powered NFTs: A Paradigm Shift in Carbon Credit Transactions for Traceability, Transparency, and Accountability; a Blockchain Framework for Digital Asset Ownership and Transfer in Succession; perspectives of Merchants Regarding Bitcoin’s Role as a Currency and Its Utility as a Payment System; a Chatbot Generator for Improved Digital Governance; A Structured Analysis of Domain-Specific Linked Open Vocabularies (LOV): Indicators for Interoperability and Reusability; predicting Digital Winners and Losers in Economic Crises Using Artificial Intelligence and Open Government Data; chatbot Technology Assessment: 40 Cases from Greece; the Effects of Economic Crisis on the Digitalization of the Greek Social Security; design, Implementation, and Evaluation of a Food Price Monitoring Tool for Supporting Data Journalists; Smartphone Apps for Parents of Preterm Infants from NICU to Home: A Quality, Evidence-Based Content and Data Protection Assessment; assessing the Progress of Portuguese Hospitals’ Online Services; Α Cross-Sector Data Space for Correlating Environmental Risks with Human Health; using Computational Knowledge Extraction Approach to Assess Three Decades of Health Management Information Systems for Informed Actions; the Role of Artificial Ethics Principles in Managing Knowledge and Enabling Data-Driven Decision Making in Supply Chain Management; fine-Tuning Large-Scale Project Scheduling; Integrating LLMs in Higher Education, Through Interactive Problem Solving and Tutoring: Algorithmic Approach and Use Cases. |
Davis voted
mohammed afaan voted
|
#961
2024 14th International Conference on Pattern Recognition Systems, ICPRS 2024
2024 14th International Conference on Pattern Recognition Systems, ICPRS 2024 2024;(): Institute of Electrical and Electronics Engineers Inc. 2024 Ref ID: 4133 The proceedings contain 42 papers. The topics discussed include: LLM-aided knowledge graph construction for zero-shot visual object state classification; enhancing Apple’s defect classification: insights from visible spectrum and narrow spectral band imaging; analyzing emotional and topical patterns in conspiracy theory narratives: a discourse comparative study on the 2023 Hawaii wildfires; non-invasive estimation of moisture content in mushrooms using hyperspectral imaging and machine learning-based stacking regressor model; a concept drift based approach to evaluating model performance and theoretical lifespan; autism spectrum disorder prediction using machine learning classifiers; adversarial contrastive representation learning for passive Wi-Fi fingerprinting of individuals; and SAI-ChileanDiet: a multi-label food dataset with self-acquired images of the Chilean diet. |
Davis voted
mohammed afaan voted
|
#896
1st Workshop on Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning, NeusymBridge 2024 at LREC-COLING 2024 - Workshop Proceedings
1st Workshop on Bridging Neurons and Symbols for Natural Language Processing and Knowledge Graphs Reasoning, NeusymBridge 2024 at LREC-COLING 2024 - Workshop Proceedings 2024;(): European Language Resources Association (ELRA) 2024 Ref ID: 4617 The proceedings contain 5 papers. The topics discussed include: probing large language models from a human behavioral perspective; the semantic relations in LLMs: an information-theoretic compression approach; word sense disambiguation as a game of neurosymbolic darts; open event causality extraction by the assistance of LLM in task annotation, dataset, and method; and the need for grounding in LLM-based dialogue systems. |
yuexi voted
mohammed afaan voted
|
#895
1st Working conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow, AI Tomorrow 2023
Informatik aktuell 2024;(): Springer Science and Business Media Deutschland GmbH 2024 Ref ID: 4679 The proceedings contain 12 papers. The special focus in this conference is on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow. The topics include: AI-Powered Knowledge and Expertise Mining in Healthcare from a Field Experiment; iterative Development of a Process-Oriented Approach for the Selection of Platform-Based Digital Services; classification of Static Poses Based on Key Point Detection for Application of Incriminated Image Files; Human Centered Implementation Process of AI in SMEs – Conditions for Success; LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT; Foundations for the Development of an AI-based, Platformindipendent cOmpanion-app [for] Lifelong Learning-Optimization (APOLLO); viability of Knowledge Management Practices for a Successful Digital Transformation in Small- and Medium- Sized Enterprises; identification of Machine Learning Algorithms to Share Tacit Experimental Knowledge in Manual Production; An Application of AI for Online Estimation of the Impact of Imperfections in Additive Manufactured Components. |
Davis voted
mohammed afaan voted
|
#894
1st International Workshop on Natural Scientific Language Processing and Research Knowledge Graphs, NSLP 2024
Lect. Notes Comput. Sci. 2024;14770 LNAI(): 2024 Ref ID: 4500 The proceedings contain 21 papers. The special focus in this conference is on Natural Scientific Language Processing and Research Knowledge Graphs. The topics include: Towards a Novel Classification of Table Types in Scholarly Publications; OCR Cleaning of Scientific Texts with LLMs; RTaC: A Generalized Framework for Tooling; scientific Software Citation Intent Classification Using Large Language Models; repoFromPaper: An Approach to Extract Software Code Implementations from Scientific Publications; Automated Extraction of Research Software Installation Instructions from README Files: An Initial Analysis; a Technical/Scientific Document Management Platform; the Effect of Knowledge Graph Schema on Classifying Future Research Suggestions; assessing the Overlap of Science Knowledge Graphs: A Quantitative Analysis; FoRC@NSLP2024: Overview and Insights from the Field of Research Classification Shared Task; NRK at FoRC 2024 Subtask I: Exploiting BERT-Based Models for Multi-class Classification of Scholarly Papers; advancing Automatic Subject Indexing: Combining Weak Supervision with Extreme Multi-label Classification; single-Label Multi-modal Field of Research Classification; Enriched BERT Embeddings for Scholarly Publication Classification; SOMD@NSLP2024: Overview and Insights from the Software Mention Detection Shared Task; Software Mention Recognition with a Three-Stage Framework Based on BERTology Models at SOMD 2024; ABCD Team at SOMD 2024: Software Mention Detection in Scholarly Publications with Large Language Models; falcon 7b for Software Mention Detection in Scholarly Documents; enhancing Software-Related Information Extraction via Single-Choice Question Answering with Large Language Models. |
Davis voted
mohammed afaan voted
|
#931
19th International Conference on Wisdom, Well-Being, Win-Win, iConference 2024
Lect. Notes Comput. Sci. 2024;14597 LNCS(): 2024 Ref ID: 4604 The proceedings contain 91 papers. The special focus in this conference is on Wisdom, Well-Being and Win-Win. The topics include: Identifying the Potential Users of Community Archives: A Case Study of the History of the Chinese 30 Years Project; What Motivates You to Use VR Exergames to Substitute for Real Sports?—An Empirical Study Based on Technology Readiness and Technology Acceptance Model; the Filtered Appeal: Evaluating the Impact of Appearance Enhancement on Effectiveness of Donation Requests; “If I Like BLANK, What Else Will I Like?”: Analyzing a Human Recommendation Community on Reddit; can Chatbot Anthropomorphism and Empathy Mitigate the Impact of Customer Anger on Satisfaction?; understanding Users’ Decision-Making on Privacy Disclosure from a Configurational Perspective Perceived Values, Privacy Concerns, Cognitive Style, and Trust; genre Recognition: A Model of Behaviour; “How I Form and Escape Information Cocoons”: An Interview Study of Users on Short Video Apps; are Older People Battling with Digital Financial Services?; plant-Based Predictions: An Exploratory Predictive Analysis of Purchasing Behavior of Meat-Alternatives by U.S. Consumers (2020); AIGC-Enabled Interdisciplinary Science Measurement; Role of Emotional Experience in AI Voice Assistant User Experience in Voice Shopping; a Contextualized Government Service Chatbot for Individuals with limited Information Literacy; Detection Vs. 
Anti-detection: Is Text Generated by AI Detectable?; privacyChat: Utilizing Large Language Model for Fine-Grained Information Extraction over Privacy Policies; reimagining Data Science Methodology for Community Well-Being Through Intersectional Feminist Voices; participatory Observation Methods Within Data-Intensive Science: Formal Evaluation and Sociotechnical Insight; from Knowledge Representation to Knowledge Organization and Back; understanding Researchers’ Data-Centric Tasks: A Classification of Goals, Gaps, and Resources. |
yuexi voted
mohammed afaan voted
|
#930
19th China National Conference on Computational Linguistics, CCL 2020
Lect. Notes Comput. Sci. 2020;12522 LNAI(): 2020 Ref ID: 5771 The proceedings contain 34 papers. The special focus in this conference is on Computational Linguistics. The topics include: Chinese Named Entity Recognition via Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism; a Practice of Tourism Knowledge Graph Construction Based on Heterogeneous Information; a Novel Joint Framework for Multiple Chinese Events Extraction; entity Relative Position Representation Based Multi-head Selection for Joint Entity and Relation Extraction; a Mixed Learning Objective for Neural Machine Translation; multi-reward Based Reinforcement Learning for Neural Machine Translation; low-Resource Text Classification via Cross-Lingual Language Model Fine-Tuning; constructing Uyghur Named Entity Recognition System Using Neural Machine Translation Tag Projection; recognition Method of Important Words in Korean Text Based on Reinforcement Learning; semantic-Aware Chinese Zero Pronoun Resolution with Pre-trained Semantic Dependency Parser; mongolian Questions Classification Based on Multi-Head Attention; the Annotation Scheme of English-Chinese Clause Alignment Corpus; categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explanation Tool; LiveQA: A Question Answering Dataset Over Sports Live; Chinese and English Elementary Discourse Units Recognition Based on Bi-LSTM-CRF Model; better Queries for Aspect-Category Sentiment Classification; multimodal Sentiment Analysis with Multi-perspective Fusion Network Focusing on Sense Attentive Language; CAN-GRU: A Hierarchical Model for Emotion Recognition in Dialogue; a Joint Model for Aspect-Category Sentiment Analysis with Shared Sentiment Prediction Layer; compress Polyphone Pronunciation Prediction Model with Shared Labels; improving Sentence Classification by Multilingual Data Augmentation and Consensus Learning; multi-task Legal Judgement Prediction Combining a Subtask of the Seriousness of 
Charges; clickbait Detection with Style-Aware Title Modeling and Co-attention. |
Davis voted
mohammed afaan voted
|
#924
17th International Conference on Knowledge Science, Engineering and Management, KSEM 2024
Lect. Notes Comput. Sci. 2024;14887 LNAI(): 2024 Ref ID: 4106 The proceedings contain 160 papers. The special focus in this conference is on Knowledge Science, Engineering and Management. The topics include: EE-LCE: An Event Extraction Framework Based on LLM-Generated CoT Explanation; attention and Learning Features-Enhanced Knowledge Tracing; An MLM Decoding Space Enhancement for Legal Document Proofreading; meta-pruning: Learning to Prune on Few-Shot Learning; knowledge-Informed Molecular Learning: A Survey on Paradigm Transfer; GenFlowchart: Parsing and Understanding Flowchart Using Generative AI; DSCVSR: A Lightweight Video Super-Resolution for Arbitrary Magnification; programming Knowledge Tracing with Context and Structure Integration; an Konwledge-Based Semi-supervised Active Learning Method for Precision Pest Disease Diagnostic; multi-label Feature Selection with Adaptive Subspace Learning; User Story Classification with Machine Learning and LLMs; PTMA: Pre-trained Model Adaptation for Transfer Learning; optimization Strategies for Knowledge Graph Based Distractor Generation; reinforced Subject-Aware Graph Neural Network for Related Work Generation; EFCC-IeT: Cross-Modal Electronic File Content Correlation via Image-Enhanced Text; multi-relation Neural Network Recommendation Model Based on Knowledge Graph Embedding Algorithm; link Prediction Based on Deep Global Information in Heterogeneous Graph; subject Knowledge Entity Relationship Extraction Based on Multi-feature Fusion and Relation Specific Horns Tagging; a Human-Computer Negotiation Model Based on Q-Learning; affine Transformation-Based Knowledge Graph Embedding; integrating Prior Scenario Knowledge for Composition Review Generation; distant Supervised Relation Extraction on Pre-train Model with Improved Multi-label Attention Mechanism; sEMG-Based Multi-view Feature-Constrained Representation Learning; vicinal Data Augmentation for Classification Model via Feature Weaken; STM: An Improved Peak Price 
Tracking-Based Online Portfolio Selection Algorithm; spatiotemporal Dependence Learning with Meteorological Context for Transportation Demand Prediction; automatic Meter Pointer Reading Based on Knowledge Distillation; multi-table Question Answering Method Based on Correlation Evaluation and Precomputed Cube; a Joint Multi-task Learning Model for Web Table-to-Knowledge Graph Matching; an In-Context Schema Understanding Method for Knowledge Base Question Answering. |
Davis voted
mohammed afaan voted
|
#917
13th International Conference on Knowledge Science, Engineering and Management, KSEM 2020
Lect. Notes Comput. Sci. 2020;12274 LNAI(): 2020 Ref ID: 5719 The proceedings contain 85 papers. The special focus in this conference is on Knowledge Science, Engineering and Management. The topics include: A dynamic answering path based fusion model for KGQA; Improving deep item-based collaborative filtering with Bayesian personalized ranking for MOOC course recommendation; online programming education modeling and knowledge tracing; enhancing pre-trained language models by self-supervised learning for story cloze test; MOOCRec: An attention meta-path based model for Top-K recommendation in MOOC; PVFNet: Point-view fusion network for 3D shape recognition; HEAM: Heterogeneous network embedding with automatic meta-path construction; a graph attentive network model for P2P lending fraud detection; an empirical study on recent graph database systems; graph embedding based on characteristic of rooted subgraph structure; bibliometric analysis of twitter knowledge management publications related to health promotion; automatic cerebral artery system labeling using registration and key points tracking; page-level handwritten word spotting via discriminative feature learning; NADSR: A network anomaly detection scheme based on representation; a knowledge-based scheduling method for multi-satellite range system; IM-net: Semantic segmentation algorithm for medical images based on mutual information maximization; fast backward iterative laplacian score for unsupervised feature selection; improving low-resource chinese event detection with multi-task learning; feature selection using sparse twin support vector machine with correntropy-induced loss; customized decision tree for fast multi-resolution chart patterns classification; knowledge graphs meet geometry for semi-supervised monocular depth estimation; predicting user influence in the propagation of toxic information; extracting distinctive shapelets with random selection for early classification; butterfly-based 
higher-order clustering on bipartite networks; preface. |
Davis voted
mohammed afaan voted
|
#916
13th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2021
Lect. Notes Comput. Sci. 2021;12672 LNAI(): 2021 Ref ID: 5630 The proceedings contain 67 papers. The special focus in this conference is on Intelligent Information and Database Systems. The topics include: Entropy-Based Variational Learning of Finite Generalized Inverted Dirichlet Mixture Model; mixture-Based Unsupervised Learning for Positively Correlated Count Data; phase Prediction of Multi-principal Element Alloys Using Support Vector Machine and Bayesian Optimization; VEGAS: A Variable Length-Based Genetic Algorithm for Ensemble Selection in Deep Ensemble Learning; demand Forecasting for Textile Products Using Statistical Analysis and Machine Learning Algorithms; parallelization of Reinforcement Learning Algorithms for Video Games; A Gap–Based Memetic Differential Evolution (GaMeDE) Applied to Multi–modal Optimisation – Using Multi–objective Optimization Concepts; simulating Emergency Departments Using Generalized Petri Nets; what Do You Know About Your Network: An Empirical Study of Value Network Awareness in E-commerce; investigating Crossover Operators in Genetic Algorithms for High-Utility Itemset Mining; a Robust Approach to Employee Competences in Project Management; key Aspects of Customer Intelligence in the Era of Massive Data; how Spatial Data Analysis Can Make Smart Lighting Smarter; convolutional Neural Networks for Web Documents Classification; sequential Model-Based Optimization for Natural Language Processing Data Pipeline Selection and Optimization; automatic Cyberbullying Detection on Twitter Using Bullying Expression Dictionary; development of Morphological Segmentation for the Kyrgyz Language on Complete Set of Endings; empirical Study of Tweets Topic Classification Using Transformer-Based Language Models; a New Approach for Measuring the Influence of Users on Twitter; building a Domain-Specific Knowledge Graph for Business Networking Analysis; complexes of Low Dimensional Linear Classifiers with L1 Margins; N-Tier Machine Learning-Based 
Architecture for DDoS Attack Detection. |
Davis voted
mohammed afaan voted
|
#910
10th CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2021
Lect. Notes Comput. Sci. 2021;13028 LNAI(): 2021 Ref ID: 5650 The proceedings contain 116 papers. The special focus in this conference is on Natural Language Processing and Chinese Computing. The topics include: Adaptive Transformer for Multilingual Neural Machine Translation; improving Non-autoregressive Machine Translation with Soft-Masking; AutoNLU: Architecture Search for Sentence and Cross-sentence Attention Modeling with Re-designed Search Space; autoTrans: Automating Transformer Design via Reinforced Architecture Search; a Word-Level Method for Generating Adversarial Examples Using Whole-Sentence Information; RAST: A Reward Augmented Model for Fine-Grained Sentiment Transfer; pre-trained Language Models for Tagalog with Multi-source Data; accelerating Pretrained Language Model Inference Using Weighted Ensemble Self-distillation; employing Sentence Compression to Improve Event Coreference Resolution; chinese Macro Discourse Parsing on Dependency Graph Convolutional Network; BRCEA: Bootstrapping Relation-Aware Cross-Lingual Entity Alignment; employing Multi-granularity Features to Extract Entity Relation in Dialogue; attention Based Reinforcement Learning with Reward Shaping for Knowledge Graph Reasoning; entity-Aware Relation Representation Learning for Open Relation Extraction; ReMERT: Relational Memory-Based Extraction for Relational Triples; recognition of Nested Entity with Dependency Information; HAIN: Hierarchical Aggregation and Inference Network for Document-Level Relation Extraction; Incorporate Lexicon into Self-training: A Distantly Supervised Chinese Medical NER; diversified Paraphrase Generation with Commonsense Knowledge Graph; explore Coarse-Grained Structures for Syntactically Controllable Paraphrase Generation; predicting Categorial Sememe for English-Chinese Word Pairs via Representations in Explainable Sememe Space; chinese Poetry Generation with Metrical Constraints; CNewSum: A Large-Scale Summarization Dataset with Human-Annotated 
Adequacy and Deducibility Level; question Generation from Code Snippets and Programming Error Messages; extractive Summarization of Chinese Judgment Documents via Sentence Embedding and Memory Network; thinkTwice: A Two-Stage Method for Long-Text Machine Reading Comprehension. |
Davis voted
mohammed afaan voted